FOREWORD


This document contains the 2002 revised statistical standards and guidelines for the 
National Center for Education Statistics (NCES). The Center’s primary goal is to
provide high-quality, reliable, useful, and informative statistical information about the
condition of education in the United States and other countries. The standards and 
guidelines in this document — which are continually being reviewed to reflect statistical 
advances and, in some cases, organizational changes — will enable the Center to meet 
this goal. In addition, the standards present a clear statement for data users about how 
data should be collected in NCES surveys, and the limits of acceptable applications and 
use. 

The standards and guidelines represent the Center’s commitment as a federal statistical 
agency to providing quality in all of its activities. We hope that other agencies and 
organizations involved in statistical activities will find this document useful in their 
work. The NCES statistical standards are also available on the NCES web site 
(http://nces.ed.gov).


Val Plisko 

Associate Commissioner for 

Early Childhood, International, and Crosscutting Studies 
INTRODUCTION


Purpose of Statistical Standards 

This document contains the 2002 revised statistical standards and guidelines for 
the National Center for Education Statistics (NCES), the principal statistical agency 
within the U.S. Department of Education. Our primary goal is to provide high quality, 
reliable, useful, and informative statistical information to public policy decisionmakers 
and to the general public. Most of these standards and guidelines are therefore
geared toward fulfilling that goal. In particular, the standards and guidelines that
follow are intended for use by NCES staff and contractors to guide them in their data 
collection, analysis, and dissemination activities. These standards and guidelines are 
also intended to present a clear statement for data users regarding how data should be 
collected in NCES surveys, and the limits of acceptable applications and use. Beyond 
these immediate uses, we hope that other organizations involved in similar public 
endeavors will find the contents of some of these standards and guidelines useful in 
their work as well. To that end, Chart A (see page 11) displays the organizational 
structure of NCES in an effort to help those less familiar with NCES understand some 
of the relationships that are present in many of the internal review processes that are 
described in the standards and guidelines. All users of these standards and guidelines
should be aware that the contents of this document are continually being
reviewed for technological and statistical advances.


Background of Statistical Standards 

Data quality is the cornerstone of all official statistics programs. To this end, 
there are a number of international and national groups that have devoted considerable 
time and effort to delineating important concepts and principles for official statistics. 
On the international front, the United Nations (UN) and the Economic Commission for 
Europe (ECE) have both adopted a set of “Fundamental Principles of Official 
Statistics.” Included among the 10 principles are calls for statistical agencies to use 
professional standards that are based on scientific principles to guide the methods and 
procedures for the collection, processing, storage, and presentation of statistical data. 
The principles also call for the inclusion of relevant information on the sources, 
methods, and procedures of the statistics. In a similar vein, one of the main objectives 
identified by the Statistics Directorate of the Organization for Economic Co-operation 
and Development (OECD) includes the development of international statistical 
standards, systems, and collaborations. Similarly, the International Monetary Fund’s 
(IMF) data dissemination standard includes the integrity and quality of data, coverage, 
periodicity and timeliness, public access to data, and full documentation of the data 
collection. 

In the United States, there are two national committees that have each been 
working for more than a quarter of a century to improve statistical methods and data 
quality — the Federal Committee on Statistical Methodology (FCSM) and the 
Committee on National Statistics (CNSTAT). The Office of Management and Budget 
(OMB) convenes the Federal Committee to provide a forum for communicating and
disseminating information about statistical practices among all federal statistical 
agencies. The FCSM also recommends the introduction of new methodologies in 
federal statistical programs to improve data quality. 

The National Research Council of the National Academy of Sciences convenes 
CNSTAT, a committee of prominent researchers from universities and private research 
organizations, to study statistical topics to improve the effectiveness of the federal 
statistical system. CNSTAT monitors the statistical policy and coordinating activities 
of the federal government, reviews the statistical programs of federal agencies and 
suggests improvements, reviews data handling and privacy and confidentiality policies 
and provides recommendations for best practices, studies data gaps and recommends 
additions as necessary, and reviews extant methodologies and suggests improved 
statistical methods. 

CNSTAT published a monograph, “Principles and Practices for a Federal
Statistical Agency,” to assist federal statistical agencies. The main principles include relevance of
data, credibility among data users, confidentiality of data, and trust among data 
providers. Many of the practices identified parallel the “Fundamental Principles of 
Official Statistics” promulgated by the UN and the ECE. For example, statistical 
agencies should have a commitment to high quality and professional standards. In 
discussing openness about the data, CNSTAT stresses the importance of providing a 
full description of the data, the methods used, and assumptions made. The description 
should include reliable indicators of the kinds and amount of errors in the data. 
CNSTAT also stresses the importance of wide dissemination of data presented in a
user-friendly format. The CNSTAT guide was one of the tools used by NCES staff in 
planning their current revision of the agency’s statistical standards. 


Development of Statistical Standards at NCES 

NCES first adopted written statistical standards in the spring of 1987. These 
standards were the result of a multi-year evaluation and planning process that included 
a recommendation for the development of statistical standards from the Committee on 
National Statistics at the National Academy of Sciences. With that recommendation, a 
statistical standards program was initiated at NCES in 1985. Using the Energy 
Information Administration’s Standards Manual and the Census Bureau’s technical 
paper on “Standards for Discussion and Presentation of Errors in Survey and Census 
Data,” NCES staff, in consultation with outside experts, developed the 1987 version of 
NCES statistical standards. 

With the adoption of this first set of standards, the Agency Director called for a 
formal evaluation to start the following fall, to ensure that the standards were fully
implemented and to identify any difficulties with the standards. In 1989, the Center 
undertook a full-scale revision of the 1987 standards. The revisions were developed by 
NCES staff and reflected their firsthand experiences in using the 1987 standards. After 
multiple reviews of interim drafts by NCES staff and the NCES Advisory Council of 
Education Statistics, NCES Senior Staff accepted the revised standards in the spring of
1992. 


At the June 1992 release of the NCES Statistical Standards report, the Acting 
Commissioner summarized the standards in the following statement: 

They: (1) codify how we expect to behave professionally, (2) indicate the 
basis on which we expect to be judged by our peers in the statistical 
community, (3) represent the quality we expect in any of our efforts or those 
of our contractors and grantees, (4) provide a means to assure consistency 
among the studies the Center conducts, and (5) document for users, the 
methods and principles the Center employs in the collection of data. 

The Acting Commissioner also reiterated the Center’s commitment to periodic 
evaluations of the implementation of the standards and to a periodic review of the 
standards’ operational feasibility. 

The current revision process began in the summer of 1999 with a review of 
existing standards from a number of national and international statistical policy 
agencies and committees and from other international and national statistical agencies. 
At the same time, the 1992 NCES Statistical Standards were made available on the 
web, and NCES staff were given a 30-day period to submit comments concerning 
potential revisions and additions to the NCES standards. Following these activities, an 
agency-wide Steering Committee was formed to work on the standards revision 
process. The Steering Committee formed 15 Working Groups that comprised more 
than one-half of the NCES staff to work on the set of topics identified in the 1999
reviews. 

Each Working Group drafted its assigned standards, each of which underwent
a multi-step review process. Following a 30-day NCES staff comment period, the 
working group members made revisions, and the Steering Committee reviewed the 
drafts and submitted them to Senior Staff. The drafts were then reviewed by Senior 
Staff, modified as necessary, and then shared with a group of 40 to 50 representatives 
of the contractors who work with NCES on data collection, analysis, and dissemination. 
Additional revisions were incorporated following the input from this broad group. 
NCES also commissioned the National Institute of Statistical Sciences to convene an 
independent review panel of statistical experts to review and comment on the draft 
standards prior to final acceptance by the Steering Committee and Senior Management. 
The standards in this document are the result of the efforts of the many persons who 
participated in this multi-stage review process but, ultimately, NCES takes 
responsibility for any lack of clarity or completeness. 

During the recent NCES standards revision, the Office of Management and 
Budget (OMB) issued government-wide guidelines for ensuring and maximizing the 
quality of information disseminated by federal agencies. The OMB guidelines direct 
all agencies covered by the Paperwork Reduction Act (44 U.S.C. chapter 35) to develop 
and implement procedures for reviewing and substantiating the quality of information 
disseminated by the agency. In order to meet these goals, each agency is required to
develop and promulgate quality guidelines. 

In response to the OMB guidelines, the federal statistical agencies collaborated 
to identify a set of activities that are essential to maintaining the quality and credibility 
of statistical data. The NCES revised standards are organized around the shared 
framework for federal statistical agencies. NCES remains committed to the principles 
outlined by the 1992 NCES Acting Commissioner; what is more, these principles are 
reaffirmed in the OMB call for data quality guidelines. 


OMB Quality Guidelines 
Background 

Section 515 of the Treasury and General Government Appropriations Act for 
Fiscal Year 2001 (Public Law 106-554) directed the U.S. Office of Management and 
Budget (OMB) to issue government-wide guidelines that “provide policy and 
procedural guidance to Federal agencies for ensuring and maximizing the quality, 
objectivity, utility, and integrity of information (including statistical information)
disseminated by Federal agencies.” Information, as defined by OMB, includes any 
communication or representation of knowledge, such as facts or data, in any medium or 
form, including textual, numerical, graphic, cartographic, narrative or audiovisual 
forms. Dissemination refers to any agency-initiated or sponsored distribution of 
information to the public (OMB, Guidelines for Ensuring and Maximizing the Quality, 
Objectivity, Utility, and Integrity of Information Disseminated by Federal Agencies, 
February 22, 2002, 67 FR 8452-8460). 

NCES provides the public with a wide variety of information about the 
condition of American education. Information quality is important to NCES because 
educators, researchers, policymakers, and the public use NCES products for a variety of 
purposes. Thus, it is important that information products that NCES disseminates are
accurate and reliable. Most of the information products are available both as printed 
and electronic documents. They are announced on the NCES web site (nces.ed.gov),
and most electronic versions can be accessed and downloaded directly from the web 
site. 

Purpose and Scope 

NCES guidelines have been identified as Standards for the last 15 years, so
we will retain that label. The purpose of these Standards is to describe NCES policy
and procedures for reviewing and substantiating the quality of information before it is 
disseminated. These Standards are consistent with those issued by OMB and the 
Department of Education. These Standards represent a performance goal for NCES and 
are intended to improve the quality of the information NCES shares with the public.

In addition to the NCES Standards, the Department of Education and OMB 
have more general Information Quality Guidelines that apply to NCES. What is more,
NCES will follow the request for corrections and appeal process described in the
Department’s Information Quality Guidelines 
(www.ed.gov/offices/OCIO/info quantity/info guide.html).

The Standards are applicable to any information that NCES disseminates after 
October 1, 2002. In addition, some previously released information products continue 
to be used for decisionmaking or are relied upon by the Department of Education and 
the public as official, authoritative, government data; these data are, in effect, 
constantly being redisseminated and thus are subject to these Standards and to the 
Department and OMB Information Quality Guidelines. Previously released information 
products that do not meet these criteria are considered archived information and thus 
are not subject to the Guidelines. 

Beyond archived reports, these Standards also exclude certain other
information held or disseminated by NCES. The Department of Education Information
Quality Guidelines include a list of excluded items; although that list also applies to 
NCES, the items that are particularly relevant to NCES are included here. For example, 
the guidelines generally do not cover the following: internal information such as 
employee records; internal procedural, operational, or policy manuals prepared for the 
management and operations of the Department of Education (and NCES) that are not 
primarily intended for public dissemination; information collected or developed by 
NCES that is not disseminated to the public, including documents intended only for 
inter-agency or intra-agency communications; opinions that are clearly identified as 
such, and that do not represent facts or NCES views; correspondence with individuals; 
comments received from the public in response to Federal Register notices; electronic 
links to information on other web sites; and research findings published by NCES data
cooperatives or grantees, unless NCES represents or uses the information as the official 
position of the Department, or in support of the official position of the Department, or 
has authority to review and approve the information before release. 

For information covered by Information Quality Guidelines, the NCES
Standards provide a basic standard of quality that can be defined based on the three 
elements of quality as defined by OMB: utility, objectivity, and integrity. These 
elements are intended to ensure that information disseminated by NCES is useful, 
accurate, reliable, unbiased, and secure. 


Framework 

Utility refers to the usefulness of the information to its intended users. The 
usefulness of information disseminated by NCES should be considered from the
perspective of NCES, educators, education researchers, policymakers, and the public. 
Utility is achieved by staying informed of information needs and developing new 
products and services where appropriate. 

NCES wants to ensure that information it disseminates meets the needs of the 
intended users. NCES relies upon internal reviews and analyses, along with feedback 
from advisory committees, educators, education researchers, policymakers, and the 
public, to ensure that information disseminated by NCES meets the needs of intended 
users. In addition, all information products should be grammatically correct and clearly
written in plain English. The target audience should be clearly identified, and the 
product should be understandable to that audience. 

Consistent with OMB guidance, the goal is to maximize the usefulness of 
information and minimize the cost to the government and the public. When 
disseminating its information products, NCES will utilize all feasible and available 
dissemination channels so that the public, education researchers, and policymakers can 
locate NCES information in an equitable and timely fashion.

The information disseminated by NCES includes administrative and statistical 
data. NCES collects and disseminates administrative data from universe collections of 
elementary and secondary and postsecondary institutions. These universe collections 
are based on reports aggregated from records from schools, school districts, and states. 
NCES also collects and disseminates data from a number of sample survey data 
collections that are designed to fill the information needs for statistical data. NCES 
supports both ongoing sample survey data collections and special purpose surveys that 
are designed to fill data gaps or information needs that are identified through internal 
review, legislative mandates, or input from data users outside the Department. All 
statistical reports and related products are reviewed to ensure their usefulness to the 
intended users. Where appropriate, contact information is available on each publication
to facilitate feedback and questions by users. 

The specific NCES standards that contribute directly to the utility and the 
dissemination of information include those on the Initial Planning of Surveys (1-1), 
Publication and Product Planning (1-2), and the Release and Dissemination of Reports 
and Data Products (7-3). 

Objectivity refers to whether information is accurate, reliable, and unbiased and is
presented in an accurate, clear, and unbiased manner. It involves both the content of 
the information and the presentation of the information. This includes complete, 
accurate, and easily understood documentation of the source of the information, with a 
description of the sources of any errors that may affect the quality of the data, when 
appropriate. Objectivity is achieved by using reliable information sources and 
appropriate techniques to prepare information products. 

NCES strives to present information to the public in an accurate, clear,
complete, and unbiased manner. Prior to dissemination to the public, all products are 
reviewed for objectivity using sound statistical methods and the principles of 
transparency and reproducibility, as delineated in the OMB Information Quality
Guidelines. In addition, all products undergo editorial and technical peer review to 
assist NCES in meeting this goal. 

NCES is committed to the principles for objectivity in administrative and 
statistical data that are outlined in the Department of Education’s Guidelines. To that 
end, we have specific standards that relate to each of the Department’s principles: 
1. In formulating a data collection plan, goals of the study should be clearly
described — Initial Planning of Surveys (1-1), Design of Surveys (2-1), 
Developing a Request for Proposals (RFP) for Surveys (2-3). 

2. The subjects to be studied and the data to be collected should be clearly defined, 
using broadly understood concepts and definitions — Initial Planning of Surveys 
(1-1), Codes and Abbreviations (1-4), Defining Race and Ethnicity Data (1-5), 
Design of Surveys (2-1), Developing a Request for Proposals (RFP) for Surveys 
(2-3), Maintaining Data Series Over Time (2-5). 

3. The data collection techniques should be well thought out, clearly articulated, 
and designed to use state-of-the-art methodologies in the data collection — Initial 
Planning of Surveys (1-1), Design of Surveys (2-1), Survey Response Rate 
Parameters (2-2), Developing a Request for Proposals (RFP) for Surveys (2-3), 
Pretesting Survey Systems (2-4), Educational Testing (2-6), Coverage for 
Frames and Samples (3-1), Achieving Acceptable Response Rates (3-2), 
Monitoring and Documenting Survey Contracts (3-3). 

4. In designing the work, every effort should be made to minimize the amount of 
time required for survey participants — Achieving Acceptable Response Rates 
(3-2). 

5. The source of data should be reliable. In the case of sample survey data, the 
sample should be drawn from a complete list of items to be tested or evaluated, and
the appropriate respondents must be identified, correctly sampled, and queried
with survey instruments that have been properly developed and tested — Initial 
Planning of Surveys (1-1), Design of Surveys (2-1), Pretesting Survey Systems 
(2-4), Coverage for Frames and Samples (3-1). 

6. Response rates should be monitored during data collection. When necessary, 
appropriate steps should be taken to ensure the respondents are a representative 
sample — Computation of Response Rates (1-3), Survey Response Rate 
Parameters (2-2), Achieving Acceptable Response Rates (3-2), Monitoring and 
Documenting Survey Contracts (3-3), Nonresponse Bias Analysis (4-4). 

7. Care should be taken to ensure the confidentiality of personally identifiable 
data, as required by law, during data collection, processing, and analysis of the 
resulting data — Maintaining Confidentiality (4-2). 

8. Upon completion of the work, the data should be processed in a manner 
sufficient to ensure that the data are cleaned and edited to help ensure that the 
data are accurate and reliable — Initial Planning of Surveys (1-1), Design of 
Surveys (2-1), Monitoring and Documenting Survey Contracts (3-3), Data 
Editing and Imputation of Item Nonresponse (4-1), Evaluation of Surveys (4-3). 

9. The data collection should be properly documented and stored, and the 
documentation should include an evaluation of the quality of the data with a 
description of any limitations of the data — Monitoring and Documenting 
Survey Contracts (3-3), Documenting a Survey System (3-4), Machine 
Readable Products (7-1). 

10. Data should be capable of being reproduced or replicated based on information 
included in the documentation including, for example: 
a) The source(s) of the information;

b) The date the information was current;

c) Any known limitations on the information; 

d) The reason why the information is provided; 

e) Descriptions of any statistical techniques or mathematical operations applied 
to source data; and 

f) Identification of other sources of potentially corroborating or conflicting 
information. 

The relevant standards include — Monitoring and Documenting Survey 
Contracts (3-3), Documenting a Survey System (3-4), Machine Readable 
Products (7-1), Survey Documentation in Reports (7-2). 

11. If secondary analysis of data is employed, the source should be acknowledged,
the reliability of the data should be confirmed and documented, and any 
shortcomings or explicit errors should be acknowledged (e.g., the 
representativeness of the data, measurement error, data preparation error, 
processing error, sampling errors, and nonresponse errors) — Survey 
Documentation in Reports (7-2). 

12. The analysis should be selected and implemented to ensure that the data are 
correctly analyzed using modern statistical techniques suitable for hypothesis
testing. Techniques may vary from simple tabulations and descriptive analysis 
to multivariate analysis of complex interrelationships. Care should be taken to 
ensure that the techniques are appropriate for the data and the questions under 
inquiry — Statistical Analysis, Inference, and Comparison (5-1), Variance 
Estimation (5-2), Rounding (5-3), Tabular and Graphic Presentations (5-4). 

13. Reports should also include the reason the information is provided, its potential 
uses, cautions as to inappropriate extractions or conclusions, and the 
identification of other sources of corroborating or conflicting information — 
Survey Documentation in Reports (7-2). 

14. Descriptions of the data and all analytical work should be reported in sufficient 
detail to ensure that the findings could be reproduced using the same data and 
methods of analysis; this includes the preservation of the data set used to 
produce the work — Monitoring and Documenting Survey Contracts (3-3), 
Documenting a Survey System (3-4), Evaluation of Surveys (4-3), Machine 
Readable Products (7-1), Survey Documentation in Reports (7-2). 

15. All reports, data, and documentation should undergo editorial and technical 
review to ensure accuracy and clarity prior to dissemination. Qualified 
technical staff and peers outside the Department should do the technical 
review — Review of Reports and Data Products (6-1). 

16. To ensure the utility of the work, all work must be conducted and released in a 
timely manner — Publication and Product Planning (1-2), Release and 
Dissemination of Reports and Data Products (7-3). 

17. There should be established procedures to correct any identified errors. These 
procedures may include the publication of errata sheets, revised publications, or
Web postings — Review of Reports and Data Products (6-1), Release and 
Dissemination of Reports and Data Products (7-3). 
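
As an illustration only, the documentation items (a) through (f) listed under principle 10 could be carried alongside a data product as a small structured record. The sketch below is not an NCES format: the class name, field names, and the choice of which items must be non-empty are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ReproducibilityRecord:
    """Hypothetical container for documentation items (a)-(f) of principle 10."""
    sources: List[str]                 # (a) source(s) of the information
    date_current: Optional[date]       # (b) date the information was current
    reason_provided: str               # (d) reason the information is provided
    known_limitations: List[str] = field(default_factory=list)   # (c)
    techniques_applied: List[str] = field(default_factory=list)  # (e)
    other_sources: List[str] = field(default_factory=list)       # (f)


# Assumption for this sketch: items (a), (b), and (d) must always be filled in,
# while (c), (e), and (f) may legitimately be empty lists.
REQUIRED_ITEMS = ("sources", "date_current", "reason_provided")


def missing_items(record: ReproducibilityRecord) -> List[str]:
    """Return the names of required items that are empty or unset."""
    return [name for name in REQUIRED_ITEMS if not getattr(record, name)]


if __name__ == "__main__":
    # Placeholder values only; they do not describe a real NCES product.
    example = ReproducibilityRecord(
        sources=["Example universe collection"],
        date_current=date(2002, 10, 1),
        reason_provided="Illustrative example",
        techniques_applied=["weighted tabulation"],
    )
    print(missing_items(example))
```

A check of this kind could run before release so that a product lacking its source, currency date, or stated purpose is flagged for review rather than disseminated.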


Integrity refers to the security or protection of information from unauthorized 
access or revision. Integrity ensures that the information is not compromised through 
corruption or falsification. 

NCES has in place appropriate security provisions for the protection of 
confidential information that is contained in all identified systems of records. In 
accordance with statutory and administrative provisions governing the protection of 
information, NCES protects administrative records and sample survey data that include 
personally identifiable information, especially survey data that are collected under a 
pledge of confidentiality. Applicable provisions governing the protection of 
information include the following: 

• Privacy Act; 

• Computer Security Act of 1987;

• Freedom of Information Act; 

• OMB Circulars A-123, A-127, and A-130;

• Federal Policy for the Protection of Human Subjects; 

• Government Information Security Reform Act; and 

• National Education Statistics Act, as amended by the USA Patriot Act of 
2001.

The relevant standard is Maintaining Confidentiality (4-2). 

Influential Information 

The OMB guidelines for implementing section 515 recognize that some 
government information needs to meet higher quality standards than a basic standard of 
quality. The level of effort required to ensure the quality of information is tied to the 
uses of the information. Information that is defined as “influential” requires a higher 
level of effort to ensure its quality and reproducibility. Scientific, financial, and 
statistical information is considered influential if the Department can reasonably 
determine that the information is likely to have a clear and substantial impact on 
important public policies or private sector decisions if disseminated. 

Influential information must be accompanied by supporting documentation that 
allows an external user to clearly understand the steps involved in producing the 
information and to be able to reproduce the information. Any influential original data 
files must describe the design, collection, and processing of the data in sufficient detail 
that an interested third party could understand the specifics of the original data and, if 
necessary, independently replicate the data collection. In the case of influential analytic 
results, the mathematical and statistical processes used to produce the report must be 
described in sufficient detail to allow an independent analyst to substantially reproduce 
the findings using the original data and identical methods. 

When full public access to NCES data and methods is not possible due to other 
compelling interests, NCES will apply especially rigorous robustness checks to analytic 
results and will document the checks that were undertaken. In those cases where
protecting the confidentiality of individually identifiable data precludes the full release 
of a data file, persons seeking access to such data and methods are required to follow 
applicable NCES requirements and procedures for seeking such access. In all cases, 
the interest in transparency of the agency’s data shall not override other compelling 
interests, such as privacy, intellectual property, and other confidentiality protections (16 
CFR 4.9-4.11 and OMB Guidelines, par V.b.3.ii.B.j.).

Inasmuch as it is not always possible to predict in advance all of the uses of the 
information included in NCES data collections, all information collected and 
disseminated by NCES is held to the standards of quality, reproducibility, and 
documentation that are required for influential information. 
SUBJECT: INITIAL PLANNING OF SURVEYS


NCES STANDARD: 1-1 

PURPOSE: To provide an initial planning document that includes the information 
required for a decision on whether or not to proceed with the preliminary design and 
implementation plans of a specific survey or survey system. 

KEY TERMS: assessment, design effect, effect size, effective sample size, key 
variables, minimum substantively significant effect (MSSE), planning document, 
power, response rate, survey, and survey system. 

STANDARD 1-1-1: The initial plan for developing a survey or survey system must 
include the justification for the study and must describe the survey methodology. Prior 
to an OMB fiscal year budget request for data collection, the initial planning document 
must be presented to the OC/ODC for review and a decision on whether to proceed 
with the design phase. The initial planning document must include the following: 

1. A justification for the survey, including the rationale for the survey, the goals and
objectives, and related hypotheses to be tested. This justification must include 
evidence that consultations with potential users have occurred. 

2. A review of related studies, surveys, and reports of federal and nonfederal sources 
to ensure that part or all of the data are not available from an existing source, or 
could not be more appropriately obtained by adding questions to existing surveys 
sponsored by NCES or other agencies. The goal here is to minimize respondent 
burden. If a new survey is needed, efforts should be made in the development of 
the questionnaire and any assessment items to minimize the burden to individual 
respondents. 

3. Surveys that involve interviewing students in elementary and secondary schools 
must adhere to the requirements of the Protection of Pupil Rights Act and related 
amendments (see 20 U.S.C. 1232h and amendments included in Section 1061 of the 
No Child Left Behind Act of 2001). Specifically, without written consent from a 
student’s parent, questions may not be asked about the following: 

a. Political affiliations or beliefs of the student or the student’s parent; 

b. Mental or psychological problems of the student or the student’s family; 

c. Sex behavior or attitudes; 

d. Illegal, antisocial, self-incriminating, or demeaning behavior; 

e. Critical appraisals of other individuals with whom respondents have close 
family relationships; 

f. Legally recognized privileged or analogous relationships, such as those of 
lawyers, physicians, and ministers; 

g. Religious practices, affiliations, or beliefs of the student or the student’s parent; 
or 
h. Income (other than that required by law to determine eligibility for participation 
or for receiving financial assistance under such a program). 

In addition, the confidentiality and privacy provisions of the Privacy Act and the 
Education Sciences Reform Act of 2002 must be taken into account in designing 
any studies that will collect individually identifiable data from any survey 
participants (see Standard 4-2). 

4. A preliminary survey design that discusses the proposed target population, response 
rate goals (see Standard 1-3), sample design, sample size determination based on 
power analyses for the MSSEs for key variables, data collection methods, and 
methodological issues. 

5. A preliminary analysis plan that identifies analysis issues, objectives, key variables, 
minimum substantively significant effect sizes, and proposed statistical techniques. 

6. A list of data items that will be maintained over time as part of an NCES data 
series, including the justification for each item. 

7. A preliminary time schedule that accounts for the complete survey cycle from 
planning to data release. 

8. A preliminary publication and dissemination plan that identifies proposed major 
publications and their target audiences (see Standard 1-2). 

9. A preliminary survey evaluation plan that identifies the proposed analyses 
necessary for data users to understand the quality and limitations of the survey (see 
Standard 4-3). 

10. An internal cost estimate that reflects all of the above items. 
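
Item 4 above calls for sample size determination based on power analyses for the MSSEs of key variables. As a rough illustration only, the following Python sketch applies the textbook normal-approximation formula for comparing two group means and then inflates the result by a design effect. The function name, the default significance level and power, and the example values are assumptions made for this illustration; they are not prescribed by this standard.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(msse, alpha=0.05, power=0.80, design_effect=1.0):
    """Approximate cases needed per group to detect a standardized
    difference in means of size `msse` with a two-sided test.

    Uses n = 2 * (z_{1-alpha/2} + z_{power})^2 / msse^2, then multiplies
    by the design effect to allow for a complex sample design.
    """
    if msse <= 0:
        raise ValueError("msse must be positive")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    n_simple_random = 2 * (z_alpha + z_power) ** 2 / msse ** 2
    return ceil(n_simple_random * design_effect)


# Hypothetical planning values: an MSSE of 0.2 standard deviations.
n_srs = sample_size_per_group(0.2)
n_complex = sample_size_per_group(0.2, design_effect=2.0)
```

A design effect above 1 raises the required number of cases proportionally, which is why the effective sample size, rather than the nominal one, is the relevant quantity in a power analysis.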
SUBJECT: PUBLICATION AND PRODUCT PLANNING 


NCES STANDARD: 1-2 

PURPOSE: To ensure that all proposed NCES products are included in an annual 
NCES publication plan that will assist with the coordination of publications across 
divisions, in an effort to avoid duplication and to maximize collaboration. The 
publication plan will make explicit the status of all anticipated publications for the next 
year; provide target dates for all mandatory and required publications; and assure that 
appropriate attention is given to all necessary aspects of the planning process. 


STANDARD 1-2-1: All NCES publications and data products must be included in the 
annual NCES publication plan. This includes mandatory, required, and projected 
publications. 

1. Mandatory publications include a limited number of high profile reports that the 
agency is committed to release in a specific month. 

2. Required publications are those that are scheduled for release within the fiscal year, 
including most first releases from NCES data collections, such as data files, 
CD-ROMs, and electronic codebooks. 

3. Projected publications are those that may be completed during the year, but for 
which there is no predetermined expectation about a release date. These are 
staff-initiated in-depth reports and publications over whose timing the agency has 
less control. 

(See List 1-1-A for a description of NCES product types, Standard 7-2 for a description 
of content requirements by product type, and Standard 6-1 for the type of review 
required by product type.) 

GUIDELINE 1-2-1A: A publication should be added to the publication plan by the 
time it is signed off by the Program Director for Division review. 

GUIDELINE 1-2-1B: Project Directors should update changes in the NCES 
publication plan on an as-needed basis. 


STANDARD 1-2-2: All proposed publications and data products must receive Program 
Director and Associate Commissioner approval before inclusion in the NCES 
publication plan. 

GUIDELINE 1-2-2A: Bimonthly meetings between Office of the Commissioner 
(OC) publications staff and the Associate Commissioners and their division staff 
should be held to review progress on the publication plan. 
STANDARD 1-2-3: All mandatory and required publications must have firm target 
delivery dates to the OC Publication Database Coordinator for distribution for peer 
review. 

GUIDELINE 1-2-3A: The date for printed release is approximately 6 to 8 weeks 
after the final post-adjudication sign-off by the Chief Statistician. 

GUIDELINE 1-2-3B: For web release only publications, the release date can be 
simultaneous with the post-adjudication sign-off, but should occur within 1 week. 


STANDARD 1-2-4: For printed release publications, the reports will not be sent to 
GPO until the PDF file and the web publishing form are submitted to the Webmaster. 
For early web release publications, the PDF file will be posted on the web and sent to 
the publications section for review when the OC approves the release. If changes are 
needed as a result of the publications section review, the author is responsible for 
correcting the PDF. (For additional information about web publishing, contact: 
NCESWebmaster@ed.gov.) 


STANDARD 1-2-5: All analytic, descriptive, and research and development 
publications must have a written analysis plan approved by the Program Director prior 
to beginning an analysis. 

GUIDELINE 1-2-5A: The analysis plan should be developed in consultation with 
the Associate Commissioner and the Chief Statistician. 
LIST 1-1-A. NCES PUBLICATION TYPES 


Brochures/Pamphlets present an overview of NCES programs or surveys. 

CD-ROMs present NCES data and related documentation. Products include microdata 
files, documentation for microdata files, data embedded in data analysis systems, and 
data in electronic tabulations. 

Compendia are comprehensive resource publications that summarize major education 
statistics on the status and progress of education at one or more levels of education 
from preprimary through graduate education, adult education, and lifelong learning. 

Conference Reports are compilations of papers presented at NCES-sponsored 
conferences and workshops. 

Data Files present NCES data and related documentation. Products include microdata 
files and documentation for microdata files. 

Directories typically present listings of educational institutions and agencies. 

E.D. TABs are a collection of tables, presented with minimal analyses. The purpose of 
an E.D. TAB is to make tabular data available quickly. 

Guides provide descriptions of data collection programs and manuals of procedures 
which describe how to complete the activity. 

Handbooks provide descriptions of procedures and recommendations for best 
practices. 

Issue Briefs are a two- to four-page summary of a particular topic. A limited number of 
tables and charts are presented with descriptive text intended to provide a quick view of 
a current topic. 

Questionnaires/Glossaries are copies of questionnaires and glossaries from selected 
NCES data collections. 

Research and Development (R&D) Reports are detailed reports of emerging issues, 
state-of-the-art analytic approaches, and new software applications. The findings 
reported in developmental work are subject to revision as the work continues and 
additional data become available. 

Statistical Analysis Reports present an overview of results from one survey, or from 
one topic based on analysis across several surveys. The data and findings are presented 
with commentary to identify substantively and statistically significant results, and their 
relationship to educational research. 

Statistics in Brief are a short, focused analysis of a specific topic. Generally 4 to 15 
pages in length, these reports are designed to provide data on policy-relevant topics. 

Technical/Methodological Reports are an in-depth analysis of analytic methods, 
survey design, survey procedures, or data quality issues. 
User's Manuals/Data File Documentation present information on NCES data and 
related documentation. 

Videotapes are VHS formatted tapes of survey findings, case studies, or best practices. 

Working Papers provide preliminary analysis of substantive, technical, and 
methodological issues. They are works in progress that are presented to promote the 
sharing of valuable work experience and knowledge. These papers have not undergone 
a rigorous review for consistency with NCES standards. 
SUBJECT: COMPUTATION AND REPORTING OF RESPONSE RATES 
NCES STANDARD: 1-3 

PURPOSE: To ensure that response rates used to evaluate survey estimates are 
computed consistently across all NCES surveys. To calculate and report response rates 
that measure the proportion of the sample frame that is represented by the responding 
units in each study. 

KEY TERMS: cross-sectional, base weight, estimation, frame, item nonresponse, 
longitudinal, overall unit nonresponse, probability of selection, required response items, 
response rate, stage of data collection, strata, substitution, survey, total nonresponse, 
unit nonresponse, and wave. 


STANDARD 1-3-1: All response rates must be calculated using the sample base 
weights (i.e., the inverse of the probability of selection) when weighting is employed. 
Report the weighted unit response rates for each stage of data collection (e.g., schools, 
students, teachers, administrators), and for overall unit response rates. Report the range 
of total response rates for items included in each publication. Also, report specific item 
and total response rates when the item response rates fall below 70 percent (see 
Standards 2-1 and 2-2 for response rates and survey design issues, see Standard 3-2 on 
methods for achieving acceptable response rates, and see Standard 7-2 for response rate 
reporting requirements). 

GUIDELINE 1-3-1A: Unweighted response rates may be used for monitoring field 
operations (see Standard 1-3-3). 


STANDARD 1-3-2: Unit response rates (RRU) are calculated as the ratio of the 
weighted number of completed interviews (I) to the weighted number of in-scope 
sample cases (AAPOR 2000). There are a number of different categories of cases that 
comprise the total number of in-scope cases: 

I = weighted number of completed interviews; 

R = weighted number of refused interview cases; 

O = weighted number of eligible sample units not responding for reasons other than 
refusal; 

NC = weighted number of noncontacted sample units known to be eligible; 

U = weighted number of sample units of unknown eligibility, with no interview; and 
e = estimated proportion of sample units of unknown eligibility that are eligible. 

The unit response rate represents a composite of these components: 

$$RRU = \frac{I}{I + R + O + NC + e(U)}$$

EXAMPLE: In a school-based survey, the numerator of the unit response rate is 
the number of responding schools. The denominator includes the number of 
responding schools plus the summation of the number of schools that refused to 
participate, the number of eligible schools that were nonrespondents for reasons 
other than refusal, and an estimate of the number of eligible schools from those 
with unknown eligibility. Note that in this school-based survey example, there 
are no cases reported in the category for the number of eligible schools that 
were not successfully contacted. In this case, eligibility can only be determined 
by contacting a respondent for the sampled school. 
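
As a minimal sketch of the calculation in Standard 1-3-2, the Python function below computes RRU from the weighted components I, R, O, NC, U, and e defined above. The function name and the weighted counts are hypothetical and chosen only to make the arithmetic easy to follow; NC is set to zero to mirror the school-based example.

```python
def unit_response_rate(I, R, O, NC, U, e):
    """Weighted unit response rate: RRU = I / (I + R + O + NC + e * U).

    All counts are sums of base weights; e is the estimated proportion of
    unknown-eligibility units that are eligible (between 0 and 1).
    """
    if not 0.0 <= e <= 1.0:
        raise ValueError("e must be a proportion between 0 and 1")
    in_scope = I + R + O + NC + e * U
    if in_scope <= 0:
        raise ValueError("no in-scope sample cases")
    return I / in_scope


# Hypothetical weighted counts for a school-based survey (NC = 0).
rru = unit_response_rate(I=800.0, R=100.0, O=50.0, NC=0.0, U=100.0, e=0.5)
```

With these illustrative values the in-scope total is 1,000 weighted cases, of which 800 responded.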


STANDARD 1-3-3: Overall unit response rates for cross-sectional analysis (RRO^C) are 
calculated as the product of two or more unit-level response rates when a survey has 
multiple stages. 


$$RRO^{C} = \prod_{i=1}^{K} RRU_i$$

Where K = the number of stages and C denotes cross-sectional. 

There may be instances where fully accurate, current-year frame data are available for 
all cases at each stage of a survey; in that case, the estimation of overall response rates 
could be improved. However, in the absence of current-year frame data (as is usually 
the case), such improvements are not possible and the above formula should be used. 
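
The product in Standard 1-3-3 can be sketched in a few lines of Python. The function name and the two stage-level rates (a school stage and a student stage) are hypothetical values used only for illustration.

```python
from math import prod


def overall_response_rate(stage_rates):
    """Overall cross-sectional rate: the product of the stage-level RRUs."""
    rates = list(stage_rates)
    if not rates:
        raise ValueError("at least one stage-level response rate is required")
    if any(not 0.0 <= r <= 1.0 for r in rates):
        raise ValueError("each stage-level rate must be between 0 and 1")
    return prod(rates)


# Hypothetical two-stage design: schools at 90 percent, students at 80 percent.
rro_c = overall_response_rate([0.90, 0.80])
```

Because each factor is at most 1, the overall rate can never exceed the lowest stage-level rate.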


STANDARD 1-3-4: Special procedures are needed for longitudinal surveys where 
previous nonrespondents are eligible for inclusion in subsequent waves. The overall 
unit response rate used in longitudinal analysis (RRO^L) reflects the proportion of all 
eligible respondents in the sample who participated in all waves in the analysis, 
multiplied by the product of the response rates for all but the last stage of data 
collection used in the analysis. In some longitudinal surveys, some of the stages 
surveyed for the first wave are not resurveyed in subsequent waves, but the unit 
response rates for the earlier stages are components of the overall unit response rates 
for subsequent waves. 


$$RRO^{L} = \frac{I^{L}}{\left(I^{L} + R + O + NC + e(U) + W\right)_{JK}} \times \prod_{i=1}^{K-1} RRU_i$$


Where K = the last stage of data collection used in the analysis; 

J = the last wave in the analysis; 

I^L = the weighted number of responding cases common to all waves in the 
analysis; 

W = respondents to the last wave in the analysis who were nonrespondents in 
at least one of the preceding waves in the analysis; and 

∏RRU_i = the product of the unit response rates for all but the last stage of data 
collection. 

EXAMPLE: For an example in which the respondent in one stage is not 
resurveyed in subsequent waves, consider a teacher survey where states must be 
contacted to get a list of schools. This results in a first-stage unit response rate 
for the school listing activity (RRU1). The schools must then be contacted to 
obtain a list of teachers. This results in a second-stage unit response rate for the 
teacher listing activity (RRU2). Then, once a teacher sample is drawn from the 
lists, the teacher component of the survey has a third-stage unit response rate for 
the responding teachers (RRU3). The product of the first-, second-, and third- 
stage unit response rates is the overall response rate for teachers in the first 
wave of the data collection. To examine changes in job status, teachers are 
followed up in the second wave in the next school year (RRU4) and in the third 
wave the following year (RRU5). In an analysis that looks only at the results 
from the first and third waves, the response rate for teachers is the product of 
the response rate for the school listing function (RRU1), the response rate for the 
teacher listing function (RRU2), and the response rate for teachers eligible in 
both waves of the survey (i.e., the intersection of RRU3 and RRU5). 

GUIDELINE 1-3-4A: The product of the unit response rate across all stages and 
waves used in an analysis is approximately equal to the equation for RRO^L. 
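
The longitudinal rate in Standard 1-3-4 can be sketched as follows, reading the verbal definition as the weighted share of eligible cases that responded in every wave of the analysis, multiplied by the product of the response rates for all but the last stage. The function name and all numeric values are hypothetical; the two earlier-stage rates stand in for the school listing and teacher listing activities in the example above.

```python
from math import prod


def longitudinal_overall_rate(I_L, R, O, NC, U, e, W, earlier_stage_rates):
    """Overall longitudinal rate, assuming the form
    [I_L / (I_L + R + O + NC + e * U + W)] * product of earlier-stage RRUs.

    I_L: weighted cases responding in all waves in the analysis.
    W:   weighted last-wave respondents who missed at least one earlier wave.
    """
    denominator = I_L + R + O + NC + e * U + W
    if denominator <= 0:
        raise ValueError("no eligible cases in the last stage")
    return (I_L / denominator) * prod(earlier_stage_rates)


# Hypothetical teacher follow-up: two listing stages precede the teacher stage.
rro_l = longitudinal_overall_rate(
    I_L=600.0, R=100.0, O=50.0, NC=50.0, U=100.0, e=0.5, W=150.0,
    earlier_stage_rates=[0.90, 0.80],
)
```

Note that W appears only in the denominator: teachers who responded in the last wave but missed an earlier one are eligible, yet they do not count as respondents for an analysis that needs every wave.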


STANDARD 1-3-5: Item response rates (RRI) are calculated as the ratio of the number 
of respondents for whom an in-scope response was obtained (I_x for item x) to the 
number of respondents who are asked to answer that item. The number asked to answer 
an item is the number of unit-level respondents (I) minus the number of respondents 
with a valid skip for item x (V_x). When an abbreviated questionnaire is used to convert 
refusals, the eliminated questions are treated as item nonresponse. 


$$RRI_x = \frac{I_x}{I - V_x}$$


In longitudinal analyses, the numerator of an item response rate includes cases that 
have data available for all waves included in the analysis and the denominator includes 
the number of respondents eligible to respond in all waves included in the analysis. 

In the case of constructed variables, the numerator includes cases that have available 
data for the full set of items required to construct the variable, and the denominator 
includes all respondents eligible to respond to all items in the constructed variable. 

EXAMPLE: In a survey of postsecondary faculty, while all respondents are 
asked to report the number of hours spent teaching classes per week, only those 
who report actually teaching classes are asked about the number of hours spent 
teaching remedial classes (I_x). In this case, the denominator of the item response 
rate excludes faculty who do not teach classes (I - V_x). 

In the case of a longitudinal analysis, when all faculty are followed up in the 
next year to monitor time spent on teaching remedial classes, the numerator of 
the item response rate for this variable is the number of faculty who responded 
to this variable in both years. The denominator includes all who were asked in 
both years. 

Faculty job satisfaction is measured using a constructed variable that is the 
average of 3 separate items: satisfaction with professional development, 
satisfaction with administration, and satisfaction with teaching assignment. 
Only full-time faculty members are eligible to answer the satisfaction items. 
The numerator includes all full-time faculty who answered all 3 satisfaction 
items and the denominator includes all full-time faculty who completed a 
faculty questionnaire. 
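
A minimal Python sketch of the item response rate in Standard 1-3-5 follows. The function name and the counts are hypothetical: 1,000 responding faculty, of whom 400 do not teach classes and therefore validly skip the remedial-hours item, and 540 in-scope answers to that item.

```python
def item_response_rate(I_x, I, V_x):
    """Item response rate: RRI_x = I_x / (I - V_x).

    I_x: respondents with an in-scope response to item x.
    I:   unit-level respondents.
    V_x: respondents with a valid skip for item x.
    """
    asked = I - V_x
    if asked <= 0:
        raise ValueError("no respondents were asked this item")
    if I_x > asked:
        raise ValueError("more responses than respondents asked")
    return I_x / asked


# Hypothetical counts for the remedial-hours item in a faculty survey.
rri_remedial = item_response_rate(I_x=540, I=1000, V_x=400)
```

For a constructed variable, the same function applies once I_x is taken to be the number of cases with data on every component item and I - V_x the number eligible for all of them.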


STANDARD 1-3-6: Total response rates (RRT_x) for specific items are calculated as 
the product of the overall unit response rate (RRO) and the item response rate for item 
x (RRI_x). 

RRT_x = RRO * RRI_x 


EXAMPLE: The product of the overall response rate from a faculty survey 
(RRO) and the item response rate for income (RRI_x) is the item-specific total 
response rate for faculty income. 
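
The total response rate in Standard 1-3-6 is a single multiplication; the sketch below pairs it with a check for the 70 percent item response rate threshold mentioned in Standard 1-3-1. Both function names and the numeric rates are hypothetical illustrations.

```python
def total_response_rate(rro, rri_x):
    """Total response rate for item x: overall unit rate times item rate."""
    return rro * rri_x


def needs_item_level_reporting(rri_x, threshold=0.70):
    """True when the item response rate falls below the reporting threshold."""
    return rri_x < threshold


# Hypothetical faculty survey: overall unit rate 0.72, income item rate 0.90.
rrt_income = total_response_rate(0.72, 0.90)
flag_income = needs_item_level_reporting(0.90)
flag_low_item = needs_item_level_reporting(0.65)
```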


STANDARD 1-3-7: To supplement a sample when too few cases are obtained, one or 
more independent random samples of the population or sampling strata can be drawn 
and released. When this is done, the released samples must be used in their entirety. In 
this case, reported response rates must be based on the original and the added sample 
cases. 


EXAMPLE: In the event a random supplemental sample is fielded, all cases are 
included in the response rate — both the original and supplemental cases. 
Assume that six schools were sampled from a stratum, each with a base weight 
of 10. Four are respondents and two are nonrespondents. In addition, a 
supplemental sample of two schools was sampled from the stratum and was 
fielded in an attempt to compensate for the low initial rate of response. Both of 
the cases from the supplemental sample are respondents. Taking the combined 
sample into account, each fielded school has a base weight of 7.5. The response 
rate then is: 

((7.5+7.5+7.5+7.5+7.5+7.5)/(7.5+7.5+7.5+7.5+7.5+7.5+7.5+7.5)) x 100 = 75%. 
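
The supplemental-sample example can be reproduced with a short Python sketch. The function name and the (base weight, responded) tuple layout are assumptions made for illustration; the weights and response outcomes are those given in the example.

```python
def weighted_response_rate(cases):
    """Percent response rate from (base_weight, responded) pairs."""
    total_weight = sum(weight for weight, _ in cases)
    if total_weight <= 0:
        raise ValueError("no fielded cases")
    responding_weight = sum(weight for weight, responded in cases if responded)
    return 100 * responding_weight / total_weight


# Six original schools (four respond, two do not) plus two supplemental
# schools (both respond); every fielded school carries a base weight of 7.5.
original = [(7.5, True)] * 4 + [(7.5, False)] * 2
supplemental = [(7.5, True)] * 2
rate = weighted_response_rate(original + supplemental)
```

All eight fielded cases enter the denominator, which is what distinguishes a released supplemental sample from the substitution case in Standard 1-3-8.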


STANDARD 1-3-8: Substitutions may only be done using matched pairs that are 
selected as part of the initial sample selection. If substitutions are used to supplement a 
sample, unit response rates must be calculated without the substituted cases included 
(i.e., only the original cases are used). 

EXAMPLE: As an example of the case where substitutes are used, but not 
included in the response rate, assume that two schools were sampled from a 
stratum. One has a base weight of 20 and the other has a base weight of 10. The 
first school is a respondent, while the school with a base weight of 10 does not 
respond. However, a matched pair methodology was used to select two 
substitutes for each case in the original sample. After fielding the substitutes for 
the nonrespondent, the first substitute also did not respond, but the second 
substitute responded. Since we must ignore the substitutes, the response rate is: 

((20)/(20+10)) x 100 = 66.67%. 

In multiple-stage sample designs, where substitution occurs only at the first stage, the 
first-stage response rate must be computed ignoring the substitutions. Response rates 
for other sampling stages are then computed as though no substitution occurred (i.e., in 
subsequent stages, cases from the substituted units are included in the computations). If 
multiple-stage sample designs use substitution at more than one stage, then the 
substitutions must be ignored in the computation of response rate at each stage where 
substitution is used. 
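
The substitution rule in Standard 1-3-8 can be sketched in Python by dropping substitute cases before computing the weighted rate. The record layout, the field names, and the weights assigned to the two substitutes are hypothetical; the original-sample weights and outcomes follow the example above.

```python
def response_rate_excluding_substitutes(cases):
    """Percent response rate using only originally sampled cases."""
    originals = [case for case in cases if not case["substitute"]]
    total_weight = sum(case["weight"] for case in originals)
    if total_weight <= 0:
        raise ValueError("no original sample cases")
    responding_weight = sum(
        case["weight"] for case in originals if case["responded"]
    )
    return 100 * responding_weight / total_weight


cases = [
    {"weight": 20.0, "responded": True, "substitute": False},
    {"weight": 10.0, "responded": False, "substitute": False},
    # Substitutes fielded for the nonresponding school (weights assumed).
    {"weight": 10.0, "responded": False, "substitute": True},
    {"weight": 10.0, "responded": True, "substitute": True},
]
rate_without_substitutes = response_rate_excluding_substitutes(cases)
```

The responding substitute improves the achieved sample size but leaves the reported response rate unchanged.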


REFERENCE 

American Association for Public Opinion Research (AAPOR). (2000). Standard 
Definitions: Final Dispositions of Case Codes and Outcome Rates for Surveys. Ann 
Arbor, MI: AAPOR. 
SUBJECT: CODES AND ABBREVIATIONS 


NCES STANDARD: 1-4 

PURPOSE: To provide uniform codes, abbreviations, and acronyms for use in NCES 
data collection and processing that will facilitate the exchange of information and 
ensure uniformity in NCES data releases. 

KEY TERMS: Consolidated Metropolitan Statistical Area (CMSA), New England 
County Metropolitan Area (NECMA), Metropolitan Statistical Area (MSA), and 
Primary Metropolitan Statistical Area (PMSA). 


STANDARD 1-4-1: The National Institute of Standards and Technology maintains a 
variety of abbreviations under the Federal Information Processing Standards (FIPS 
PUBS). (See www.itl.nist.gov/fipspubs/index.htm for the most recent versions of these 
standards.) The following FIPS standards, or more current updates, must be used in all 
NCES data releases: 

FIPS PUB NUMBERS 

5-2   States and Outlying Areas of the United States 

6-4   County and County Equivalent of the States of the United States and DC 

8-6   Metropolitan Areas, including Metropolitan Statistical Areas (MSAs), 
      Consolidated Metropolitan Statistical Areas (CMSAs), Primary Metropolitan 
      Statistical Areas (PMSAs), and related units called New England County 
      Metropolitan Areas (NECMAs) 

9-1   Congressional Districts of the United States 

92    Standard Occupational Codes (SOC) 

STANDARD 1-4-2: The North American Industry Classification System (NAICS) 
was developed jointly by the United States, Canada, and Mexico to provide new 
comparability in statistics about business activity across North America. NAICS 
coding has replaced the U.S. Standard Industrial Classification (SIC) system, 
previously released as FIPS Publication 66. NAICS codes must now be used instead of 
SIC codes for industry coding. (See Standard 2-5 for guidance on maintaining 
comparability when adopting NAICS coding for existing data series.) Current NAICS 
codes may be obtained from the U.S. Census Bureau at: 
www.census.gov/epcd/www/naics.html. 


STANDARD 1-4-3: The following IES-sponsored coding systems must be used, 
where applicable: 

1. The Classification of Instructional Programs (CIP), which is the accepted federal 
government statistical standard on instructional program classifications at the 
postsecondary level. (See Classification of Instructional Programs [CIP-2000 Edition], 
2002 [NCES 2002-165]. U.S. Department of Education. Washington, DC: National 
Center for Education Statistics.) To access an electronic version of this publication, 
see www.nces.ed.gov/ipeds/pdf/webBase/cipman.pdf. 

2. The College Course Map (CCM), which is a classification scheme for college 
courses offered in the United States. (See Adelman, C. 1995. The New College 
Course Map and Transcript Files. Washington, DC: U.S. Department of 
Education, National Institute on Postsecondary Education, Libraries, and Lifelong 
Learning.) 

3. The Secondary School Taxonomy, which is a classification scheme for high school 
courses offered in the United States. (See Bradby, D., and Hoachlander, G. (1999). 
1998 Revision of the Secondary School Taxonomy (NCES 1999-06). U.S. 
Department of Education. Washington, DC: National Center for Education 
Statistics Working Paper.) 


STANDARD 1-4-4: Where appropriate, the NCES Publications Guide must be 
utilized, along with the United States Government Printing Office Style Manual (GPO 
Style Manual). Official national, state, and international abbreviations are listed on 
pages 147-170 of the Style Manual, 2000 edition. These abbreviations must be used 
where appropriate in NCES publications. The current version of the NCES 
Publications Guide may be found at www.ed.gov/offices/OERI/MIS/guide.html. The 
GPO Style Manual may be obtained at the GPO web site (www.access.gpo.gov). 
SUBJECT: DEFINING RACE AND ETHNICITY DATA 


NCES STANDARD: 1-5 

PURPOSE: To provide common language to promote uniformity and comparability 
for the collection and reporting of data on race and ethnicity. This standard is in 
compliance with the definitions and procedures included in the 1997 revision of the 
OMB Statistical Policy Directive No. 15. 

KEY TERMS: American Indian or Alaska Native, Asian, Black or African 
American, confidentiality, edit, Hispanic or Latino, imputation, Native Hawaiian 
or Other Pacific Islander, public-use data file, White, and survey. 


STANDARD 1-5-1: Pending further government-wide research on the best practices 
for collecting information about race and ethnicity on individual-level surveys, NCES 
will follow OMB guidelines on the use of a two-question format — except under rare 
circumstances in which a one-question format is justified on the basis of research or 
other documentation. 

With the two-question format, the ethnicity question must come first, followed by the 
question on race. 

Ethnicity is based on the following categorization: 

Hispanic or Latino: A person of Cuban, Mexican, Puerto Rican, South or 
Central American, or other Spanish culture or origin, regardless of race. The 
term "Spanish origin" can be used in addition to "Hispanic or Latino." 

Race is based on the following five categorizations: 

American Indian or Alaska Native: A person having origins in any of the 
original peoples of North and South America (including Central America), and 
who maintains tribal affiliation or community attachment. 

Asian: A person having origins in any of the original peoples of the Far East, 
Southeast Asia, or the Indian subcontinent, including, for example, Cambodia, 
China, India, Japan, Korea, Malaysia, Pakistan, the Philippine Islands, Thailand, 
and Vietnam. 

Black or African American: A person having origins in any of the black racial 
groups of Africa. Terms such as "Haitian" or "Negro" can be used in addition to 
"Black or African American." 

Native Hawaiian or Other Pacific Islander: A person having origins in any of 
the original peoples of Hawaii, Guam, Samoa, or other Pacific Islands. 

White: A person having origins in any of the original peoples of Europe, the 
Middle East, or North Africa. 

The race question must allow respondents to choose one or more of the listed 
categories. Taken together, the Hispanic/Latino category from the ethnicity question 
and the 5 race categories result in 64 possible combinations of race and Hispanic 
ethnicity. 1 

The ethnicity question is: 

What is this person’s ethnicity? 

Hispanic or Latino 
Not Hispanic or Latino 

The race question is: 

What is this person’s race? Mark one or more races to indicate what this person 
considers himself/herself to be. 

White 

Black or African American 
Asian 

American Indian or Alaska Native 
Native Hawaiian or Other Pacific Islander 2 

GUIDELINE 1-5-1A: Generally, data collections will only include the categories 
that are listed above in the sample questions. The two ethnicity and five race 
categories represent the minimum categories established by OMB. However, in 
cases where the sample size is sufficient, NCES may elect to expand the ethnicity 
question to a format similar to the 2000 Decennial Census question to ask about 
specific Hispanic or Latino ethnicities. 

EXAMPLE: 

Is this person Hispanic or Latino? 

No, not Hispanic/Latino 

Yes, Mexican, Mexican American, Chicano 

Yes, Puerto Rican 

Yes, Cuban 

Yes, other Spanish/Latino (specify ) 

Similarly, if there is a need for more detail and the sample size can support it, an 
expanded list of races may be used. If more detail is collected, it must be possible 
to aggregate the data into the minimum categories specified by OMB. 


1 See appendix A for a full list of the 64 categories. 

2 The categories are presented in order of numerical frequency in the population, rather than 
alphabetically. Previous research studies have found that following alphabetical order in the question 
categories creates difficulties. That is, having “American Indian or Alaska Native” as the first category 
results in substantial overreporting of this category. 


STANDARD 1-5-2: The OMB standards “shall be used for all Federal administrative 
reporting or record keeping that include data on race and ethnicity.” However, 
“agencies that cannot follow these standards must request a variance from OMB.” The 
Department of Education requested and received an OMB variance to allow time for 
the development of a single Department reporting standard for administrative record 
data. Under the existing variance, the Department will publish categories that are to be 
implemented in fall 2004. The following text is taken from the OMB’s 1999 Draft 
Provisional Guidance on the Implementation of the 1997 Standards for the Collection 
of Federal Data on Race and Ethnicity section on Standards for Monitoring, Collecting, 
and Presenting Federal Data on Race and Ethnicity data formats using a two-question 
format: 

“To provide flexibility and ensure data quality, separate questions shall be 
used whenever feasible for reporting race and ethnicity. When race and 
ethnicity are collected separately, ethnicity shall be collected first. If race and 
ethnicity are collected separately, the minimum designations are: 

Race: 

American Indian or Alaska Native 
Asian 

Black or African American 

Native Hawaiian or Other Pacific Islander 

White 

Ethnicity: 

Hispanic or Latino 
Not Hispanic or Latino 

When data on race and ethnicity are collected separately, provision shall be 
made to report the number of respondents in each racial category who are 
Hispanic or Latino. 

When aggregate data are presented, data producers shall provide the number 
of respondents who marked (or selected) only one category, separately for 
each of the five racial categories. In addition to these numbers, data producers 
are strongly encouraged to provide the detailed distributions, including all 
possible combinations, of multiple responses to the race question. If data on 
multiple responses are collapsed, at a minimum the total number of 
respondents reporting “more than one race” shall be made available.” 
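
The aggregate reporting requirements quoted above can be illustrated with a small Python sketch that counts respondents who marked only one race, the total marking more than one race, and the number of Hispanic or Latino respondents in each racial category. The record layout (an ethnicity flag paired with a set of marked races), the function name, and the sample records are hypothetical.

```python
from collections import Counter

RACES = (
    "American Indian or Alaska Native",
    "Asian",
    "Black or African American",
    "Native Hawaiian or Other Pacific Islander",
    "White",
)


def tabulate(records):
    """Summarize (is_hispanic, races) records.

    Returns single-race counts, the count of respondents marking more than
    one race, and the number of Hispanic or Latino respondents in each
    racial category.
    """
    single_race = Counter()
    hispanic_by_race = Counter()
    more_than_one = 0
    for is_hispanic, races in records:
        unknown = set(races) - set(RACES)
        if unknown:
            raise ValueError(f"unrecognized race categories: {sorted(unknown)}")
        if len(races) == 1:
            single_race[next(iter(races))] += 1
        elif len(races) > 1:
            more_than_one += 1
        if is_hispanic:
            for race in races:
                hispanic_by_race[race] += 1
    return single_race, more_than_one, hispanic_by_race


# Hypothetical responses, for illustration only.
records = [
    (False, {"White"}),
    (True, {"White"}),
    (False, {"Asian", "White"}),
    (True, {"Black or African American"}),
]
single_race, more_than_one, hispanic_by_race = tabulate(records)
```

Keeping the full set of marked races on each record, rather than a collapsed code, is what makes it possible to produce both the single-race counts and the detailed multiple-response distributions later.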


STANDARD 1-5-3: Full detail on race and ethnicity as reported by individuals or 
collected from administrative data must be maintained on restricted-access data files 
and on public-use data files, within the constraints imposed by relevant confidentiality 
laws and administrative policies (see Standard 4-2). 

GUIDELINE 1-5-3A: Survey documentation should describe how race and 
ethnicity questions were asked, how imputation and edits were accomplished, and 
what decisions were made to create aggregation categories. 


STANDARD 1-5-4: When reporting data on race and ethnicity in government 
publications, every effort must be made to use at least the minimal reporting categories, 
described below, whenever possible. More categories should be used when there are 
enough cases to support finer detail. However, if there are not enough cases in any 
individual category of race or Hispanic ethnicity, the data for that category and for the 
next smallest category must be included in the total but not shown separately, and must 
be footnoted as such. Alternatively, if several categories cannot be shown, the 
combined categories must be reported as an “other” category, and footnoted to describe 
the exact components. 

The following are the desired minimal reporting categories for race and ethnicity in 
government publications. The decision rules for each combination of race and ethnicity 
are shown in italics: 

American Indian or Alaska Native, not Hispanic or Latino 

(This category includes only persons who reported American Indian or Alaska Native 
as their sole race and did not report Hispanic ethnicity.) 

Asian, not Hispanic or Latino 

(This category includes only persons who reported Asian as their sole race, but did not 
report Hispanic ethnicity.) 

Black, not Hispanic or Latino 

(This category includes only persons who reported Black as their sole race, but did not 
report Hispanic ethnicity.) 

Native Hawaiian or Other Pacific Islander, not Hispanic or Latino 

(This category includes only persons who reported Native Hawaiian or Other Pacific 
Islander as their sole race, but did not report Hispanic ethnicity.) 

White, not Hispanic or Latino 

(This category includes only persons who reported White as their sole race, but did not 
report Hispanic ethnicity.) 

More than one race, not Hispanic or Latino 

(This category includes persons who reported any combination of more than one race 
and who either reported not being Hispanic or Latino or did not report Hispanic or 
Latino ethnicity.) 

Hispanic or Latino, regardless of race 

(This category includes Hispanic or Latino ethnicity and any combination of race.) 
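
The decision rules above can be expressed as a small classification function. The following is a minimal sketch in Python, not an NCES implementation. The function name `reporting_category`, its arguments, the treatment of unreported ethnicity for single-race respondents, and the handling of records with no race reported are assumptions made for illustration.

```python
from typing import Iterable, Optional

# Minimal reporting labels, keyed by the five minimum race designations.
SINGLE_RACE_LABELS = {
    "American Indian or Alaska Native":
        "American Indian or Alaska Native, not Hispanic or Latino",
    "Asian":
        "Asian, not Hispanic or Latino",
    "Black or African American":
        "Black, not Hispanic or Latino",
    "Native Hawaiian or Other Pacific Islander":
        "Native Hawaiian or Other Pacific Islander, not Hispanic or Latino",
    "White":
        "White, not Hispanic or Latino",
}


def reporting_category(races: Iterable[str],
                       hispanic: Optional[bool]) -> Optional[str]:
    """Return the minimal reporting category for one respondent.

    races    -- the race designations the respondent selected
    hispanic -- True, False, or None when ethnicity was not reported
    """
    selected = set(races)
    unknown = selected - set(SINGLE_RACE_LABELS)
    if unknown:
        raise ValueError(f"unrecognized race designation(s): {sorted(unknown)}")

    # Hispanic or Latino ethnicity takes precedence over any race response.
    if hispanic is True:
        return "Hispanic or Latino, regardless of race"

    # Assumption: unreported ethnicity (None) is treated like "not Hispanic"
    # for single-race respondents, as the standard states explicitly only
    # for the "More than one race" category.
    if len(selected) > 1:
        return "More than one race, not Hispanic or Latino"
    if len(selected) == 1:
        return SINGLE_RACE_LABELS[next(iter(selected))]

    # Assumption: the standard gives no rule for a non-Hispanic record with
    # no race reported, so the sketch leaves it unclassified.
    return None
```

For example, `reporting_category(["Asian", "White"], None)` falls into the "More than one race" category, while any respondent with `hispanic=True` is reported as Hispanic or Latino whatever races were selected.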

GUIDELINE 1-5-4A: The names for the groups should be capitalized, per the 
United States Government Printing Office Style Manual, 2000 (e.g., White, Black, 
Asian). 

GUIDELINE 1-5-4B: When the publication contains substantial text, the category 
names may be abbreviated after the first presentation of the categories. The authors 
should introduce the shortened version of the category label by saying that the two 
are used interchangeably in the text. 

The following abbreviated names are suggested for use in text or in tables and 
figures: 

American Indian (instead of American Indian or Alaska Native) 

Black (instead of Black or African American) 

Pacific Islander (instead of Native Hawaiian or Other Pacific Islander) 

Hispanic (instead of Hispanic or Latino) 

A footnote is needed to describe these “abbreviations” as follows: 

American Indian includes Alaska Native, Black includes African American, Pacific 
Islander includes Native Hawaiian, and Hispanic includes Latino. Race categories 
exclude Hispanic origin unless specified. 

SUBJECT: DISCRETIONARY GRANT DESCRIPTIONS 
NCES STANDARD: 1-6 


PURPOSE: To assist NCES staff in the preparation of high quality discretionary grant 
descriptions. The description should include the information required to allow an 
applicant to submit a proposal that demonstrates technical and managerial competence 
sufficient to successfully complete a project. Each grant description should also 
include the selection criteria to be used in accordance with federal and Department of 
Education regulations. 


STANDARD 1-6-1: Grant descriptions must be written in compliance with guidelines 
established in the Education Department General Administrative Regulations 
(EDGAR). 

GUIDELINE 1-6-1A: The Grants Policy and Oversight Staff (GPOS) in the Office 
of the Chief Financial Officer can provide expertise and guidance in the 
development of the grant description and application process. 


STANDARD 1-6-2: The team leader for the grant is responsible for providing 
technical advice and recommendations to the prospective grantee. 

GUIDELINE 1-6-2A: Within NCES, the staff member who develops the 
application package and related documents should be designated as grant team 
leader. The individual who develops the application package should have 
completed required courses for administering the grants process. Minimally, the 
grant team leader should be included in the development process, and should be 
familiar with the grant requirements and expectations. 


STANDARD 1-6-3: The grant process must include the following four activities: 

1. Submit the Application Notice for publication in the Federal Register. This invites 
applications for a competition, gives basic program and fiscal information, and 
informs potential applicants when and where they may obtain applications. 

2. Prepare the Grant Application Package, which must include the standard 
information for all discretionary grant programs to comply with the policies and 
regulations of the Department and the Office of Management and Budget (OMB). 
In addition, include a clear, precise, and accurate description of the problem to be 
addressed and the expected activities, services, or products, and level of effort to be 
delivered under the grant. This includes technical, statistical, managerial, and 
product objectives. 

3. Provide Guidance for Completing Applications, which describes the required 
elements of a grant application package, including cover sheet, narrative of 
proposed activities and budget for these activities, and assurances of compliance 
with requirements imposed by the U.S. Secretary of Education. 

4. Develop an Application Technical Review Plan that describes how applications for 
funding should be evaluated. This plan should include procedures for evaluating 
applications, including review panels, criteria for selecting reviewers, technical 
review forms, method for ranking applications for funding, and basis for 
recommending applications for funding. 

GUIDELINE 1-6-3A: The application package should provide the applicant with a 
statement of statistical, temporal, and reporting guidelines for design, 
implementation, and analysis, as appropriate. Managerial guidelines should 
delineate the activities to be performed by the grantee and those to be performed 
by NCES. The products (e.g., analysis plans, final reports) should be termed 
“deliverables,” and guidelines for due dates should be provided. 

PLANNING AND DESIGN OF SURVEYS 


2-1 Design of Surveys 

2-2 Survey Response Rate Parameters 

2-3 Developing a Request for Proposals (RFP) for 
Surveys 

2-4 Pretesting Survey Systems 

2-5 Maintaining Data Series Over Time 

2-6 Educational Testing 

SUBJECT: DESIGN OF SURVEYS 


NCES STANDARD: 2-1 

PURPOSE: To identify the survey design components required to conduct a data 
collection. 

KEY TERMS: confidentiality, domain, estimation, field test, frame, individually 
identifiable data, key variables, planning document, precision, probability of selection, 
response rate, strata, survey, survey system, target population, and variance. 


STANDARD 2-1-1: A technical document that delineates the basic design of a survey 
or survey system must be developed prior to the initiation of a data collection. The 
document must address the objectives of the survey as indicated in the initial planning 
document; the survey design; the data collection plan; and the personnel resources, 
funds, and time needed to achieve high data quality. To meet this standard, the survey 
design plan must include the following: 

1. A detailed discussion of the goals and objectives of the survey or survey system, 
including the information needs that will be met, content areas included, target 
population(s), and analytic goals. 

2. A discussion of the sample design that describes how it will yield the data required 
to meet the objectives of the survey. The discussion must include the 
following: identification of the sampling frame and the adequacy of the frame (see 
Standard 3-1); sampling strata; power analyses to determine sample sizes for key 
variables by reporting domains; sample size by stratum; the known probability of 
selection; expected yield by stratum; estimated efficiency of sample design; 
weighting plan; variance estimation techniques appropriate to the survey design; 
and expected precision of estimates for key variables. 

3. A listing of all survey data items, including time series data items, how each item 
can best be measured (e.g., through questionnaires, tests), and reasonable evidence 
that these items are valid and can be measured both accurately and reliably. 

4. An analysis plan providing evidence that the basic information needs which justify 
the study can be met through the proposed data collection. The plan must 
demonstrate how the proposed sample, the survey items, and the measurement 
methods are related to the objectives of the survey. 

5. The anticipated data collection procedures, including timing of data collection; 
primary mode of collection; and methods for achieving acceptable response rates 
(see Standard 3-2). 

6. A plan for preserving the confidentiality of the data during collection, processing, 
and analysis, if individually identifiable data will be collected. An analysis plan for 
disclosure risk control is also required to prepare a public-use data file (see 
Standard 4-2). 

7. An outline of a plan for quality assurance during each phase of the survey process 
that will permit monitoring and assessing the performance during implementation. 
The plan must include contingencies to modify the survey procedures, if design 
parameters appear unlikely to meet expectations (for example, low response rates) 
(see Standard 3-3). 

8. A plan for field testing the survey or survey system (see Standard 2-4). 

9. An outline of the general parameters for evaluating survey procedures and results 
(see Standard 4-3). 

10. General specifications for an internal project management system that identifies 
critical activities and key milestones of the survey that will be monitored, and the 
time relationships among them (see Standard 3-3). 

11. An Independent Government Cost Estimate (IGCE) for the entire study, including, 
for example, the pilot test, the main study, file preparation and documentation, 
disclosure risk analysis, the survey evaluation, and analysis and reporting. 
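
Item 2 above calls for sample sizes and the expected precision of estimates for key variables. As an illustration only, the sketch below computes the number of units to select so that a proportion is estimated within a given margin of error, inflated by a design effect and by an expected response rate. The formula is the usual textbook approximation for a simple random sample. The function name, the default 95 percent confidence level, and the example values are assumptions, not NCES requirements.

```python
import math
from statistics import NormalDist


def required_sample_size(p, margin, deff=1.0, response_rate=1.0,
                         confidence=0.95):
    """Units to select so that proportion p is estimated within +/- margin.

    p             -- anticipated proportion (0 < p < 1)
    margin        -- desired half-width of the confidence interval
    deff          -- design effect relative to simple random sampling
    response_rate -- expected unit response rate as a fraction
    confidence    -- confidence level for the interval
    """
    if not (0 < p < 1 and margin > 0 and deff > 0
            and 0 < response_rate <= 1 and 0 < confidence < 1):
        raise ValueError("inputs out of range")
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_srs = z ** 2 * p * (1 - p) / margin ** 2   # simple random sample
    n_design = n_srs * deff                      # inflate for the design
    return math.ceil(n_design / response_rate)   # inflate for nonresponse


# Hypothetical domain: p = 0.5, +/- 5 points, deff = 2, 85 percent response.
print(required_sample_size(0.5, 0.05, deff=2.0, response_rate=0.85))
```

Repeating the calculation for each reporting domain and stratum gives a first approximation of the sample sizes that the design document would then need to justify.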

SUBJECT: SURVEY RESPONSE RATE PARAMETERS 


NCES STANDARD: 2-2 

PURPOSE: To specify design parameters for survey response rates. High survey 
response rates help to ensure that survey results are representative of the target 
population. Surveys conducted by or for NCES must be designed and executed to meet 
the highest practical rates of response. To ensure that nonresponse bias analyses are 
conducted when response rates suggest the potential for bias to occur. 

KEY TERMS: assessment, cross-sectional, key variables, longitudinal, nonresponse 
bias, response rate, stage of data collection, substitution, survey, target population, and 
universe. 


STANDARD 2-2-1: Universe data collections must be designed to meet a target unit 
response rate of at least 95 percent. 

GUIDELINE 2-2-1A: A unit-level nonresponse bias analysis is recommended in 
the case where the universe survey unit response rate is less than 90 percent. (See 
Standard 4-4 for a discussion of nonresponse bias analysis.) 


STANDARD 2-2-2: Sample survey unit response rates must be calculated without 
substitutions (see Standard 1-3). NCES sample survey data collections must be 
designed to meet unit-level response rate parameters that are at least consistent with 
historical response rates from surveys conducted with best practices. 

GUIDELINE 2-2-2A: The following parameters summarize current NCES 
historical experiences: 

1. For longitudinal sample surveys, the target school-level unit response rate 
should be at least 70 percent. In the base year and each follow-up, the target 
unit response rates at each additional stage should be at least 90 percent. 

2. For cross-sectional samples, the target unit response rate should be at least 85 
percent at each stage of data collection. 

3. For random-digit dial sample surveys, the target unit response rate should be at 
least 70 percent for the screener and at least 90 percent for each survey 
component. 

4. For household sample surveys, the target response rates should be at least 90 
percent for the screener and at least 85 percent for the respondents. 

5. For assessments, the target response rate should be at least 80 percent for 
schools and at least 85 percent for students. 

Stage-specific design response rates, by type of survey 

                       Stage-specific design response rates (percent) 
Type of survey         Screener     School     All other 
Universe                  —           —           95 
Cross-sectional           —           85          85 
Longitudinal              —           70          90 
Assessment                —           80          85 
Random-Digit Dial         70          —           90 
Household                 90          —           85 
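
The table lends itself to a simple lookup. The sketch below stores the stage-specific design targets from the table and reports the stages whose observed rate falls short of the target. The dictionary layout and the function name `stages_below_target` are illustrative assumptions.

```python
# Stage-specific design response rates (percent), transcribed from the table.
# A stage that is absent from a survey type's entry has no target in the table.
DESIGN_TARGETS = {
    "universe":          {"all other": 95},
    "cross-sectional":   {"school": 85, "all other": 85},
    "longitudinal":      {"school": 70, "all other": 90},
    "assessment":        {"school": 80, "all other": 85},
    "random-digit dial": {"screener": 70, "all other": 90},
    "household":         {"screener": 90, "all other": 85},
}


def stages_below_target(survey_type, observed):
    """Return {stage: (observed rate, target)} for stages under target.

    survey_type -- a key of DESIGN_TARGETS
    observed    -- {stage: observed unit response rate in percent}, using
                   the column names "screener", "school", and "all other"
    """
    targets = DESIGN_TARGETS[survey_type]
    return {
        stage: (rate, targets[stage])
        for stage, rate in observed.items()
        if stage in targets and rate < targets[stage]
    }


# Hypothetical longitudinal survey: schools meet the target, students do not.
print(stages_below_target("longitudinal", {"school": 72.5, "all other": 88.0}))
```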


STANDARD 2-2-3: NCES sample survey data collections must be designed to meet a 
target item response rate of at least 90 percent for each key item. 


STANDARD 2-2-4: A nonresponse bias analysis is required at any stage of a data 
collection with a unit response rate less than 85 percent. If the item response rate is 
below 85 percent for any items used in a report, a nonresponse bias analysis is also 
required for each of those items (this does not include individual test items). The extent 
of the analysis must reflect the magnitude of the nonresponse (see Standard 4-4). 

In longitudinal sample surveys, item nonresponse bias analyses need only be done once 
for any individual item, unless there is a substantial deterioration in the item response 
rate. 
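
The rules in Standard 2-2-4 can be sketched as a check over observed response rates. This is an illustration, not an NCES tool: the function and argument names are hypothetical, rates are assumed to be percentages already calculated in accordance with Standard 1-3, and the exclusion of individual test items is left to the caller.

```python
BIAS_ANALYSIS_THRESHOLD = 85.0  # percent, for both unit and item response rates


def required_bias_analyses(unit_rates_by_stage, item_rates):
    """List the nonresponse bias analyses that Standard 2-2-4 would call for.

    unit_rates_by_stage -- {stage name: unit response rate in percent}
    item_rates          -- {item name: item response rate in percent} for
                           items used in a report (individual test items
                           should be left out by the caller)
    """
    required = []
    # A unit-level analysis is needed at any stage below the threshold.
    for stage, rate in unit_rates_by_stage.items():
        if rate < BIAS_ANALYSIS_THRESHOLD:
            required.append(("unit", stage, rate))
    # An item-level analysis is needed for each reported item below it.
    for item, rate in item_rates.items():
        if rate < BIAS_ANALYSIS_THRESHOLD:
            required.append(("item", item, rate))
    return required


# Hypothetical rates for a two-stage school survey.
print(required_bias_analyses(
    {"school": 82.0, "student": 91.0},
    {"family income": 78.5, "age": 99.0},
))
```

In a longitudinal survey, the caller could drop items that were already analyzed in an earlier round unless their response rates have deteriorated substantially.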


STANDARD 2-2-5: In cases where prior experience suggests the potential for an 
overall unit response rate of less than 50 percent, the decision to proceed with data 
collection must be made in consultation with the Associate Commissioner, Chief 
Statistician, and Commissioner. 

SUBJECT: DEVELOPING A REQUEST FOR PROPOSALS (RFP) FOR 
SURVEYS 

NCES STANDARD: 2-3 

PURPOSE: To assist NCES staff in the preparation of high quality RFPs. Each RFP 
should include the information required to allow an offeror to submit a proposal that 
demonstrates technical and managerial competence sufficient to complete successfully 
all phases of surveys. Each RFP should include evaluation criteria to assist the 
government in selecting the best offeror to conduct the work. The RFP should provide a 
clear, precise, and accurate description of the requirement for the work and the 
expected activities, services, products, and level of effort to be delivered under the 
contract. 

KEY TERMS: award incentive plan, survey, and survey system. 


STANDARD 2-3-1: RFPs must be written in compliance with guidelines established in 
the Federal Acquisition Regulations (FAR) and in other departmental administrative 
procedures and guidelines. 

GUIDELINE 2-3-1A: The contracting office of the Department of Education is 
responsible for the acquisition process for NCES and can provide expertise and 
guidance in the development of the RFP. 

GUIDELINE 2-3-1B: Within NCES, the staff member who is responsible for the 
development of a Statement of Work (SOW) and related documents should also be 
designated Contracting Officer's Representative (COR). The staff member 
responsible for the development of the SOW should have completed courses 
required for COR certification. Minimally, the individual designated as COR 
should be included in the development process, to provide familiarity with the 
contractual requirements and expectations. 


STANDARD 2-3-2: The Statement of Work (SOW) must contain technical, 
managerial, and deliverable specifications (see Standards 1-1 and 2-2). 

GUIDELINE 2-3-2A: The technical specifications for all phases of design, 
implementation, and analysis include methodological, statistical, timeline, resource, 
analysis, and data file parameters. Managerial specifications should be written as 
specific activities and tasks. Those to be performed by the contractor and those to 
be performed by NCES should be clearly delineated. There should be a schedule for 
all deliverables (e.g., analysis plans, final reports). 


STANDARD 2-3-3: The COR must be fully certified and must maintain COR 
certification. COR certification requires courses on contracting overview, independent 
government cost estimates, preparing performance-based statements of work, and 
contract administration. To maintain COR certification, the COR must complete an 
advanced contract administration course every 2 years as well as periodic required 
courses, such as courses on the Department of Education’s financial management 
system, EDCAPS, and the Contracts and Purchasing Software System (CPSS). 


STANDARD 2-3-4: The COR must develop an Independent Government Cost 
Estimate (IGCE) that includes estimates of the cost of the project for all phases and 
elements of the survey system in terms of the contractor's manpower commitment by 
labor categories and other related costs. Automated Data Processing (ADP) cost, or 
Information Technology (IT) costs, must be estimated within each of the budget 
categories, to yield an estimate of total ADP costs within the total budget. Total 
estimated cost must not exceed the NCES budget amount for the project. 

GUIDELINE 2-3-4A: For further information, consult previous comparable 
project estimates. 
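
The arithmetic behind Standard 2-3-4 can be sketched as follows: ADP/IT costs are recorded within each budget category, summed to a total ADP cost, and the overall estimate is compared with the project budget. The data layout, category names, and dollar amounts below are hypothetical and serve only to illustrate the check.

```python
def summarize_igce(categories, budget):
    """Total an Independent Government Cost Estimate and its ADP/IT share.

    categories -- {category name: {"cost": total dollars for the category,
                                   "adp": dollars of that cost that are ADP/IT}}
    budget     -- NCES budget amount for the project, in dollars
    """
    total = 0.0
    total_adp = 0.0
    for name, entry in categories.items():
        # ADP cost is a component of the category cost, so it cannot exceed it.
        if entry["adp"] > entry["cost"]:
            raise ValueError(f"ADP cost exceeds category cost for {name!r}")
        total += entry["cost"]
        total_adp += entry["adp"]
    return {
        "total": total,
        "total_adp": total_adp,
        "within_budget": total <= budget,
    }


# Hypothetical two-category estimate checked against a $600,000 budget.
estimate = {
    "design":          {"cost": 100_000, "adp": 10_000},
    "data collection": {"cost": 400_000, "adp": 50_000},
}
print(summarize_igce(estimate, budget=600_000))
```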


STANDARD 2-3-5: To obtain funding commitment, the COR must initiate the 
authorization and have it approved by the Division’s Associate Commissioner. The 
COR must confirm the survey’s fiscal year scheduled activity and obtain all accounting 
information with the budget contact source in the Office of the Deputy Commissioner 
(ODC). The ODC will commit the survey funds in the Department’s financial system 
and electronically submit the authorization to the Contracting Officer (CO). 


STANDARD 2-3-6: The Proposal Evaluation Plan specifies the membership of the 
Technical Evaluation Panel (TEP), who serve as advisers to the Contracting Officer 
(CO). The plan also provides the criteria on which the COR and the TEP assess the 
proposals. The COR, in collaboration with the CO, assigns the factors and weights 
associated with each criterion. Only criteria and weights stated in the RFP may be used 
to evaluate submitted proposals (see Standards 1-1 and 2-2). 

GUIDELINE 2-3-6A: The criteria may include such factors as technical 
competence, analysis plan, familiarity with data files, and management plan. 


STANDARD 2-3-7: The Proposal Preparation Instructions inform the offeror as to the 
substantive, format, and organizational requirements for completing their proposal. 
The offeror must submit two separate proposals: (1) technical and (2) business. They 
are evaluated separately. 


STANDARD 2-3-8: The COR must prepare the required clearances and approvals for 
the planned survey activity. The standard clearances for all new RFPs are currently the 
Information Technology (IT) Resources clearance, Impact Determination clearance, 
and the Administrative Test for Characterizing Particular Services as “Personal” or 
“Nonpersonal” clearance. 

GUIDELINE 2-3-8A: Each RFP survey may have its own applicable/special 
clearances depending on the type of procurement required. (The ACS Departmental 
Directive, C: GPA: 2-105, Acquisition Planning, dated June 10, 1992 or later 
should be referenced to explain the standard clearances noted above and possible 
other clearances or approvals that might be required.) 


STANDARD 2-3-9: The Award Incentive Plan for a performance-based contract must 
include a description of deliverables, schedules, and other evaluation criteria. It must 
also provide definitions of quality for each criterion and the associated incentive award 
fee or penalty. The evaluation criteria must include, but are not limited to, the 
definition of the work in measurable and/or mission-related terms. 

GUIDELINE 2-3-9A: This plan tells the contractor what activity or product is 
required to be considered for an award incentive, above and beyond the acceptable 
standards for the contract. It also tells the contractor when penalties may be applied. 
In addition to a specified set of activities or products, NCES may include an option 
to pre-select at random additional deliverables for award or penalty. 

GUIDELINE 2-3-9B: Award incentives criteria frequently include such factors as 
quantity, timeliness, or quality. Other criteria that are sometimes used include 
commercial or industry-wide standards that are used to measure performance. 

GUIDELINE 2-3-9C: An award fee incentive can be applied as a specified amount 
for a specific deliverable or the award fee can be applied in increments related to 
quality of the deliverable. Award incentive fees are based on the Contracting 
Officer’s Representative’s (COR) evaluation and ranking of the deliverables. The 
amount of the award incentive fee is determined by negotiations involving the 
COR, NCES senior management, and the Contracting Officer prior to awarding the 
contract. 

GUIDELINE 2-3-9D: The following documents offer specific guidance on how to 
develop a performance-based solicitation: 

1. “Information on Best Practices for Performance-Based Service Contracting,” 
October 1998, published by the Office of Federal Procurement Policy at OMB. 

2. “Federal Acquisition Circular 97-1.” 

3. “Federal Acquisition Regulation Subpart 37.6.” 

These documents are accessible through the Acquisition Reform Network 
(www.arnet.gov). 

SUBJECT: PRETESTING SURVEY SYSTEMS 


NCES STANDARD: 2-4 

PURPOSE: To ensure that all components of a survey system will function as intended 
when implemented in the full-scale survey. 

KEY TERMS: edit, estimation, field test, frame, imputation, instrument, pretest, 
response rate, stage of data collection, survey, survey system, and variance. 


STANDARD 2-4-1: One type of pretest is a pilot test, in which some components of 
a survey system can be pretested prior to a field test of the survey system (for example, 
focus groups, cognitive laboratory work, pilot tests, and/or calibration studies). 


STANDARD 2-4-2: A second type of pretest is a field test. Components of a survey 
system that cannot be successfully demonstrated through previous work must be field 
tested prior to implementation of the full-scale survey. The design of a field test must 
reflect realistic conditions, including those likely to pose difficulties for the survey. 
Documentation of the field test (e.g., materials for technical review panels, working 
papers, technical reports) must include the design of the field test; a description of the 
procedures followed; analysis of the extent to which the survey components met the 
pre-established criteria; discussion of other potential problems uncovered during the 
field test; and recommendations for changes in the design to solve the problems. 

GUIDELINE 2-4-2A: Elements to be tested and measured may include alternative 
approaches to accomplishing a particular task. Elements to be tested may include 
frame development; sample selection; questionnaire design; data collection; 
response rates; data processing (e.g., entry, editing, imputation); estimation (e.g., 
weighting, variance computation); file creation; and tabulations. 

GUIDELINE 2-4-2B: For an ongoing survey, new elements or content should be 
field tested, along with elements being changed as a result of the evaluation of the 
survey (see Standard 4-3). 

GUIDELINE 2-4-2C: The evaluation criteria for a successful field test should be 
developed before the field test begins. Key evaluation criteria are established during 
the design stage. If the criteria are not met, that survey component should not be 
implemented without field testing a redesigned component. 

GUIDELINE 2-4-2D: The results of a field test should be available and analyzed 
for internal use prior to making a decision to implement the full-scale survey. 

GUIDELINE 2-4-2E: Survey design and instrumentation should be revised to 
reflect modifications suggested by the results of the field test. A revised budget 
should be developed, if necessary, to reflect both changes in design and knowledge 
gained during the field test about resource requirements. 

SUBJECT: MAINTAINING DATA SERIES OVER TIME 


NCES STANDARD: 2-5 

PURPOSE: To maintain and report NCES data series that are consistent over time. 

KEY TERMS: bridge study, consistent data series, crosswalk study, key variables, and 
survey. 


STANDARD 2-5-1: NCES must maintain and report on a consistent set of data series 
that may be analyzed over time. Ongoing data collections must maintain and report on a 
consistent set of key variables, which are based on consistent data collection 
procedures. 

GUIDELINE 2-5-1A: Identify the basic key variables to be assessed on a regular 
basis to address policy issues and other information needs. 

GUIDELINE 2-5-1B: Provide estimates of both change and level for time series 
data in reports. For survey reports, consider publishing 3 or more years of the time 
series data along with the current year to highlight the time series. 

GUIDELINE 2-5-1C: Provide a list of other publications containing the data for 
previous years in the appendix of a survey report. 


STANDARD 2-5-2: Continuous improvement efforts sometimes result in a trade-off 
between the desire for consistency and a need to improve a data collection. If changes 
are needed in key variables or survey procedures for data series, a plan must be 
developed that provides the justification or rationale for the changes in terms of their 
usefulness for policymakers, conducting analyses, and addressing information needs. 
The plan must also describe adjustment methods, such as crosswalks and bridge studies, 
that will be used to preserve trend analyses. 



SUBJECT: EDUCATIONAL TESTING 


NCES STANDARD: 2-6 

PURPOSE: To ensure that educational tests used in NCES surveys for measuring and 
making inferences about education-related domains are valid, technically sound, and 
fair. To ensure that the administration and scoring of educational tests are standardized, 
that scales used over time are stable, and that the results are reported in a clear, unbiased 
manner. 

KEY TERMS: accommodation, assessment, classical test theory, cut score, derived 
score, Differential Item Functioning (DIF), disability, domain, equating, fairness, field 
test, Individualized Education Plan (IEP), instrument, Item Response Theory (IRT), 
linkage, precision, reliability, scaling, scoring/rating, Section 504, survey, and validity. 


STANDARD 2-6-1: Instrument Development — All test instruments used in NCES 
surveys must be developed following an explicit set of specifications. The development 
of the instrument must be documented so that it can be replicated. The instrument 
documentation must include the following: 

1. Purpose(s) of the instrument; 

2. Domain or constructs that will be measured; 

3. Framework of the instrument in terms of items, tasks, questions, response 
formats, and modes of responding; 

4. Number of items and time required for administration; 

5. Context in which the instrument will be used; 

6. Characteristics of intended participants; 

7. Desired psychometric properties of the items, and the instrument as a whole; 

8. Conditions and procedures of administering the instrument; 

9. Procedures of scoring; and 

10. Reporting of the obtained scores. 

GUIDELINE 2-6-1A: Relevant experts should review the domain definitions and 
the instrument specifications. The qualifications of the experts, the process by 
which the review is conducted, and the results of the review should be documented. 

GUIDELINE 2-6-1B: All items should be reviewed before and after pilot and field 
tests. Pilot and field tests should be conducted on subjects with characteristics 
similar to intended participants. The sample design for pilot and field tests should 
be documented. 

GUIDELINE 2-6-1C: The field test sample should include an adequate number of 
cases with the characteristics necessary to determine the psychometric properties of 
items. 

GUIDELINE 2-6-1D: Empirical analysis and the model (e.g., Classical and/or 
Item Response Theory) used to evaluate the psychometric properties of the items 
during the item review process should be documented. 

GUIDELINE 2-6-1E: When a time limit is set for performance, the extent to 
which the scores include a speed component and the appropriateness of this 
component to the defined domain should be documented. 

GUIDELINE 2-6-1F: If the conditions of administration are allowed to vary across 
participants, the variations and rationale for them should be documented. 

GUIDELINE 2-6-1G: Directions for test administrations should be described with 
sufficient clarity for others to replicate. 

GUIDELINE 2-6-1H: When a shortened or altered form of an instrument is used, 
the differences from the original instrument and the implications of those 
differences for the interpretations of scores should be documented. 


STANDARD 2-6-2: Validity — All test instruments used in NCES surveys must meet 
the purpose(s) stated in the instrument specifications. All intended interpretations and 
proposed uses of raw scores, scale scores, cut scores, equated scores, and derived 
scores, including composite scores, sub-scores, score differences, and profiles, must be 
supported by evidence and theory. 

GUIDELINE 2-6-2A: Evidence of validity should be based on analyses of the 
content, response processes (i.e., the thought processes used to produce an answer), 
internal structure of the instrument, and/or the relationship of scores to a criterion. 

GUIDELINE 2-6-2B: The rationale for each intended use of the test instruments 
and the proposed interpretations of the scores obtained should be explicitly stated. 

GUIDELINE 2-6-2C: When judgments occur in the validation process, the 
selection process for the judges (experts/observers/raters) and the criteria for 
judgments should be described. 


STANDARD 2-6-3: Reliability — The scores obtained by a test instrument must be 
free from the effects of random variations due to factors such as administration 
conditions and/or differences between scorers. The reliability of the scores must be 
adequate for the intended interpretations and uses of the scores. 

The reliability must be reported, either as a standard error of measurement or as an 
appropriate reliability coefficient (e.g., alternate form coefficient, test-retest/stability 
coefficient, internal consistency coefficient, generalizability coefficient). Methods 
(including selection of sample, sample sizes, sample characteristics) of quantifying the 
reliability of both raw and scale scores must be fully described. Scorer reliability, rater 
to rater, and rater-year reliability must be reported when the scoring process involves 
judgment. 
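
As an illustration of the two reporting options named above, the following minimal sketch computes one internal consistency coefficient (Cronbach's alpha) and the standard error of measurement. The function names and the sample responses are hypothetical, and the choice of alpha is an assumption; the standard permits other coefficients.

```python
import math
from statistics import stdev, variance


def cronbach_alpha(item_scores):
    """Internal consistency coefficient (Cronbach's alpha).

    item_scores: one row per examinee, one column per item.
    """
    k = len(item_scores[0])                       # number of items
    columns = list(zip(*item_scores))             # scores grouped by item
    item_variances = [variance(col) for col in columns]
    total_variance = variance([sum(row) for row in item_scores])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def standard_error_of_measurement(total_scores, reliability):
    """SEM = SD of observed scores * sqrt(1 - reliability)."""
    return stdev(total_scores) * math.sqrt(1 - reliability)


# Hypothetical responses: 4 examinees, 2 dichotomously scored items.
scores = [[1, 1], [0, 0], [1, 1], [0, 0]]
alpha = cronbach_alpha(scores)
# The reliability of 0.75 is an assumed value used only to show the formula.
sem = standard_error_of_measurement([sum(row) for row in scores], 0.75)
```

Whichever statistic is reported, the sample used to compute it (selection, size, characteristics) still has to be described as the standard requires.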

GUIDELINE 2-6-3A: All relevant sources of measurement errors and summary 
statistics of the size of the errors from these sources should be reported. 

GUIDELINE 2-6-3B: When average scores for participating groups are used, the 
standard error of measurement of group averages should be reported. Standard error 
statistics should include components due to sampling examinees, as well as 
components due to measurement error of the test instrument. 

GUIDELINE 2-6-3C: Reliability information on scores for each group should be 
reported when an instrument is used to measure different groups (e.g., 
race/ethnicity, gender, age, or special populations). 

GUIDELINE 2-6-3D: Reliability information should be reported for each version 
of a test instrument when original and altered versions of an instrument are used. 

GUIDELINE 2-6-3E: Separate reliability analyses should be performed when 
major variations of the administration procedure are permitted to accommodate 
disabilities. 


STANDARD 2-6-4: Fairness — Test instruments used in NCES surveys must be 
designed, developed, and administered in ways that treat participants equally and fairly, 
regardless of differences in personal characteristics such as race, ethnicity, gender, age, 
socioeconomic status, or disability that are not relevant to the intended uses of the 
instrument. 

GUIDELINE 2-6-4A: Language, symbols, words, phrases, and content that are 
generally regarded as offensive by members of particular groups should be 
eliminated, except when judged to be necessary for adequate representation of the 
domain. 

GUIDELINE 2-6-4B: Although differences in the subgroups’ performance do not 
necessarily indicate that a measurement instrument is unfair, differences between 
groups should be investigated to make sure that they are not caused by construct- 
irrelevant factors. 

GUIDELINE 2-6-4C: When research shows that Differential Item Functioning 
(DIF) exists, studies should be conducted to detect and eliminate aspects of test 
design, content, and format that might bias test scores for a particular subgroup. 
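
One widely used way to screen an item for DIF is the Mantel-Haenszel procedure. The sketch below is an assumption about method, not a procedure prescribed by this guideline: examinees are matched on total score, and a common odds ratio is pooled across the matched levels. The function names and counts are hypothetical.

```python
import math


def mantel_haenszel_odds_ratio(strata):
    """Common odds ratio pooled over matched score levels.

    strata: iterable of (ref_correct, ref_incorrect,
                         focal_correct, focal_incorrect) counts,
            one tuple per matched ability level.
    """
    numerator = 0.0
    denominator = 0.0
    for ref_c, ref_i, foc_c, foc_i in strata:
        n = ref_c + ref_i + foc_c + foc_i
        if n == 0:
            continue                      # skip empty score levels
        numerator += ref_c * foc_i / n
        denominator += ref_i * foc_c / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these counts")
    return numerator / denominator


def mh_d_dif(odds_ratio):
    """Odds ratio re-expressed on the ETS delta scale (-2.35 * ln)."""
    return -2.35 * math.log(odds_ratio)


# Hypothetical counts for one item at two matched score levels.
ratio = mantel_haenszel_odds_ratio([(10, 10, 10, 10), (20, 5, 20, 5)])
```

An odds ratio near 1 (a delta value near 0) suggests that matched examinees in the two groups perform similarly on the item; values far from 1 would flag the item for the kind of follow-up study described in the guideline.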


GUIDELINE 2-6-4D: In testing applications where the level of linguistic or 
reading ability is not a purpose of the assessment, the linguistic or reading demands 
of the test instrument should be kept to a minimum. 

GUIDELINE 2-6-4E: The testing or assessment process should be carried out so 
that test takers receive comparable and equitable treatment during all phases of the 
testing process. 


STANDARD 2-6-5: Testing individuals with disabilities or limited English 
proficiency — Whenever possible, scores derived from test instruments used in NCES 
surveys must validly, reliably, and fairly reflect the performance of all participants, 
including individuals with disabilities and individuals of diverse linguistic 
backgrounds. Although the exact procedures will vary across surveys, appropriate and 
reasonable accommodations in accordance with applicable federal nondiscrimination 
laws for special populations must be incorporated. Differences in performance must 
reflect the construct measured rather than any construct-irrelevant factors such as 
disabilities and/or language differences. 

GUIDELINE 2-6-5A: Permitted accommodations and/or modifications for special 
populations and the rationale for each accommodation should be documented in the 
data file and survey methodology report. 

GUIDELINE 2-6-5B: The extent to which data gathered with accommodations 
meet measurement standards of validity and reliability should be documented. 

For individuals with disabilities: 

GUIDELINE 2-6-5C: Empirical procedures used to review items to ensure 
fairness, to evaluate whether DIF exists, and to determine accommodations for 
students/individuals with disabilities should be included in the documentation. 

GUIDELINE 2-6-5D: Decisions about accommodations for individuals with 
disabilities should be made by individuals who are knowledgeable of existing 
research on the effects of the specific disabilities on test performance. 

GUIDELINE 2-6-5E: The participant's Individualized Education Plan (IEP) or 
Section 504 plan must be consulted prior to making determinations of whether a 
participant with a disability will participate in the assessment, and what 
accommodations, if any, are appropriate. 

For individuals of diverse linguistic backgrounds: 

GUIDELINE 2-6-5F: Empirical procedures used to review items to ensure 
appropriateness of materials for participants with various backgrounds and 
characteristics (e.g., nativity, experience in U.S. schools) should be documented to 
evaluate whether DIF exists, and to evaluate the linguistic or reading demands to 
ensure that they are no greater than required. 

GUIDELINE 2-6-5G: If an instrument is translated to another language, 
translation evaluation procedures, and the comparability of the translated instrument 
to the original version should be documented. 


STANDARD 2-6-6: Administration — Administration of all test instruments used in 
each NCES survey must be standardized. Test administration must follow procedures 
specified in the test administration manual. The administration manual must include 
descriptions of the following: 

1. Brief statement of the purpose of the survey and the population to be tested; 

2. Required qualifications of those administering the instrument; 

3. Required identifying information of the participant; 

4. Materials, aids, or tools that are required, optional, or prohibited; 

5. Allowable instructions to the participants and procedures for timing the testing; 

6. Assignment of participants to groups, or special seating arrangements, and 
preparation of participants as relevant; 

7. Allowable accommodations; 

8. Desired testing conditions/environment; and 

9. Procedures to maintain security of the materials as applicable, and actions to take 
when irregularities are observed. 

GUIDELINE 2-6-6A: Administration procedures should be field tested. The 
approved procedures should be described clearly so they can be easily followed. 

GUIDELINE 2-6-6B: Survey staff administering the instrument should be trained 
according to the procedures prescribed in the administration manual. 

GUIDELINE 2-6-6C: Modifications or disruptions to the approved procedures 
should be documented so the impact of such departures can be studied. 

GUIDELINE 2-6-6D: Instructions presented to participants should include 
sufficient detail to allow the participants to respond to the task in the manner 
intended by the instrument developer. 

GUIDELINE 2-6-6E: Samples of administration sites should be monitored to 
ensure that the instrument is administered as specified. 


STANDARD 2-6-7: Scoring and Scaling — Test scoring must be standardized within 
each survey, and scales must be stable if used over time. 

GUIDELINE 2-6-7A: Machine-scoring procedures should be checked for 
accuracy. The procedure, as well as the nature and extent of scoring errors, should 
be documented. 

GUIDELINE 2-6-7B: Hand scoring procedures should be documented, including 
rules governing scoring decisions, training procedures used to teach the rules to the 
coding staff, quality monitoring system used, and quantitative measures of the 
reliability of the resulting ratings. Criteria for evaluating the quality of individual 
responses should not be changed during the course of the scoring process. 

GUIDELINE 2-6-7C: All systematic sources of errors during the scoring process 
should be corrected and documented. 

GUIDELINE 2-6-7D: Consistency among scorers and potential drift over time in 
scoring/rating should be evaluated and documented. 
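
Consistency between two scorers can be summarized with a chance-corrected agreement index. The sketch below uses Cohen's kappa, which is an assumed choice; the guideline does not name a statistic. The function name and ratings are hypothetical.

```python
from collections import Counter


def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two scorers on the same responses."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("both scorers must rate the same non-empty set")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    # Agreement expected if the two scorers rated independently.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0                        # both scorers used a single category
    return (observed - expected) / (1 - expected)


# Hypothetical scores assigned by two raters to the same four responses.
kappa_same = cohens_kappa([1, 1, 0, 0], [1, 1, 0, 0])
kappa_chance = cohens_kappa([1, 1, 0, 0], [1, 0, 1, 0])
```

Computing the same index on a set of responses rescored at different points in the scoring period is one way to look for drift over time.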

GUIDELINE 2-6-7E: Meanings, interpretations, limitations, rationales, and 
processes of establishing the reported scores should be clearly described in the 
technical report. 

GUIDELINE 2-6-7F: Stability of the scale should be monitored and corrected or 
revised, when necessary, if a scale is maintained over time. 

GUIDELINE 2-6-7G: Procedures for scoring — raw scores, scale scores — should 
be documented. The documentation should also include a description of the 
populations used for their development. 

GUIDELINE 2-6-7H: Procedures for deriving the weights should be described 
when weights are used to develop the scale scores. 

GUIDELINE 2-6-7I: Population norms to which the summary statistics refer 
should clearly be defined when group performance is summarized using norm 
scores. 

GUIDELINE 2-6-7J: Rationales and procedures for establishing cut scores should 
be documented when cut scores are established as part of the scale score reporting. 

GUIDELINE 2-6-7K: Cut scores should be valid; that is, participants above a cut 
point should demonstrate a qualitatively greater degree and/or different type of 
skills/knowledge than those below the cut point. 

GUIDELINE 2-6-7L: The method employed in a judgmental standard-setting 
process should be documented. The documentation should include the following: 

1. Selection and qualifications of judges; 

2. Nature of the request for their judgments; 

3. Training provided to the judges; 

4. Feedback of information to judges; 

5. Opportunities for judges to confer with one another concerning their judgments; 
and 

6. Methods used to aggregate the judgments and translate them into cut scores. 

GUIDELINE 2-6-7M: The judgmental methods used to establish cut scores should 
meet the following three criteria: 

1. The judgmental method should involve peer review and pretesting. 

2. The judgments to be provided should not be so cognitively complex that the 
judges are unable to provide meaningful judgments. 

3. The process used to set cut scores should be described in sufficient detail so the 
process can be replicated. 

GUIDELINE 2-6-7N: An estimate of the amount of variability in cut scores must 
be provided regardless of whether the standard-setting procedure is replicated. 
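
When the standard-setting procedure is not replicated, one simple estimate of variability comes from the spread of the individual judges' recommended cut scores. The sketch below assumes the final cut score is the mean of the judges' recommendations and treats the judges as a simple random sample of possible judges; both are assumptions, and the names and values are hypothetical.

```python
import math
from statistics import mean, stdev


def cut_score_summary(judge_cut_scores):
    """Return (cut score, standard error) from individual judges' recommendations."""
    n = len(judge_cut_scores)
    cut = mean(judge_cut_scores)
    standard_error = stdev(judge_cut_scores) / math.sqrt(n)
    return cut, standard_error


# Hypothetical final-round recommendations from four judges.
cut, se = cut_score_summary([60, 62, 64, 66])
```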

GUIDELINE 2-6-7O: Equating/linking functions should be invariant across 
subpopulations when equating or linking is used to determine equivalent scores. 
Supporting evidence for the interchangeability of tests/test forms should be 
provided. 

GUIDELINE 2-6-7P: Detailed technical information (i.e., design of equating 
studies, standard errors of measurement, statistical methods used, size and relevant 
characteristics of samples used, and psychometric properties of anchor items) 
should be provided for the methods by which equating or linking is established. 

GUIDELINE 2-6-7Q: Users should be warned that scores are not directly 
comparable when converted scores from two versions of the test are not strictly 
equivalent. 


STANDARD 2-6-8: Reporting — Test results should be provided with 
sufficient detail and contextual information to understand the inferences that can and 
cannot be made from them. 

GUIDELINE 2-6-8A: The analysis of item responses or test scores should be 
described in detail, including procedures for scaling or equating. 

GUIDELINE 2-6-8B: Appropriate interpretations of all reported scores should be 
provided. The interpretations should describe what the test covers, what the scores 
mean, and the precision of the scores. The generalizability and limitations of 
reported scores should also be presented. Potential users should be cautioned 
against unsupported interpretations; that is, interpretations of scores that have not 
been investigated, or interpretations of scores inconsistent with available evidence. 

GUIDELINE 2-6-8C: Validity and reliability should be reported for the level of 
aggregation for which the scores are reported when matrix sampling is used. Scores 
should not be reported for individuals unless the validity, comparability, and 
reliability of such scores indicate that reporting individual scores is meaningful. 


STANDARD 2-6-9: Manual — All evidence of compliance with the standards set forth 
above for each test instrument used in NCES surveys must be compiled in a manual. 

GUIDELINE 2-6-9A: Technical documentation should provide technical and 
psychometric information on a test as well as information on test administration, 
scoring, and interpretation. 


SUBJECT: COVERAGE FOR NCES FRAMES AND SAMPLES 
NCES STANDARD: 3-1 

PURPOSE: To ensure that necessary steps are taken to develop and maintain data 
collections that are used as sampling frames, and that coverage of sampling frames is 
evaluated and documented. 

KEY TERMS: capture/recapture, confidentiality, coverage, coverage error, dual- 
frame estimation, estimation, frame, frame population, freshening, half-open interval, 
multiplicity estimation, noncoverage, overcoverage, supplemental area frame, survey, 
survey system, target population, undercoverage, and variance. 


STANDARD 3-1-1: Staff responsible for NCES data collections that serve as sampling 
frames for other NCES surveys must evaluate the coverage of the frame and document 
coverage rates at least once every 5 years. 

GUIDELINE 3-1-1A: Frames can be retrospectively compared against alternative 
frames found inside and outside of the Department of Education, considering total 
list count comparisons, matching operations, and dual-frame estimation procedures 
using capture/recapture procedures to estimate noncoverage, and providing an 
estimation of missing units. 
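
The capture/recapture idea mentioned above can be sketched with the Lincoln-Petersen estimator. This is a simplified illustration that assumes the two frames are independent, matching is error free, and neither frame contains out-of-scope units; the function names and counts are hypothetical.

```python
def lincoln_petersen(n_frame_a, n_frame_b, n_matched):
    """Estimated population size from two overlapping lists."""
    if n_matched <= 0:
        raise ValueError("at least one matched unit is required")
    return n_frame_a * n_frame_b / n_matched


def estimated_coverage(n_frame_a, n_frame_b, n_matched):
    """Estimated share of the population that appears on frame A."""
    return n_frame_a / lincoln_petersen(n_frame_a, n_frame_b, n_matched)


# Hypothetical counts: 900 units on the frame being evaluated,
# 500 on an alternative frame, 450 found on both.
population = lincoln_petersen(900, 500, 450)
coverage = estimated_coverage(900, 500, 450)
missing_units = population - 900
```

Under these assumptions the coverage estimate equals the share of the alternative frame's units that were matched to the frame being evaluated (450 of 500 in the example).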

GUIDELINE 3-1-1B: Staff responsible for NCES data collections that are used as 
sampling frames should maintain two-way communications with survey staff who 
use their collection as a frame. Procedures such as sharing preliminary data files 
with survey staff in order to develop frames may be instituted. (For example, staff 
that use an administrative list of public schools for their frames should be alerted 
when new data are available and each time there is a major change in the list.) 


STANDARD 3-1-2: NCES data collections that are used as sampling frames for other 
NCES surveys must strive for coverage rates in excess of 95 percent overall and for 
each major stratum. 


STANDARD 3-1-3: Staff using NCES frames for sample surveys must be cognizant 
of coverage issues and must take the steps necessary to provide satisfactory coverage 
for the sample survey. If there is not evidence of a coverage rate of at least 85 percent 
of the target population, then frame enhancements such as frame supplementation or 
dual-frame estimation must be incorporated into the survey study design. 
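
The two thresholds in Standards 3-1-2 and 3-1-3 can be checked mechanically once coverage rates are available. The sketch below assumes a coverage rate is the number of target population units on the frame divided by an estimate of the target population size; the function names and figures are hypothetical.

```python
FRAME_GOAL = 0.95        # Standard 3-1-2: rates "in excess of 95 percent"
SURVEY_MINIMUM = 0.85    # Standard 3-1-3: "at least 85 percent"


def coverage_rate(frame_units_in_target, estimated_target_units):
    """Share of the target population that is represented on the frame."""
    return frame_units_in_target / estimated_target_units


def frame_meets_goal(overall_rate, rates_by_stratum):
    """True when the overall rate and every major stratum exceed 95 percent."""
    return overall_rate > FRAME_GOAL and all(
        rate > FRAME_GOAL for rate in rates_by_stratum.values()
    )


def needs_frame_enhancement(rate):
    """True when coverage falls short of the 85 percent minimum."""
    return rate < SURVEY_MINIMUM


# Hypothetical rates for a frame with two major strata.
overall = coverage_rate(970, 1000)
meets_goal = frame_meets_goal(overall, {"elementary": 0.97, "secondary": 0.94})
```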

GUIDELINE 3-1-3A: The first time a survey is conducted, background design 
and coverage work should be done before choosing the frame. Alternative frames, 
if applicable, should be considered and compared. 

GUIDELINE 3-1-3B: Coverage errors such as over- and undercoverage, bad 
contact information, classification, temporal errors, and other listing errors should 
be minimized before the use of a frame. Techniques such as list supplements, 
multiplicity estimation, half-open intervals, and un-duplication can be used to 
reduce these errors and improve coverage of the frame. 

GUIDELINE 3-1-3C: Any possible changes to frame variables identified by 
sample survey staff should be reported to the staff responsible for the data 
collection being used as the frame. For example, the relevant variables to maintain 
and consider include (1) eligibility (e.g., grade span); (2) contact information (e.g., 
name, address, and phone number); (3) classification variables (e.g., state and 
school level); and (4) measures of size (e.g., grade enrollment). 

GUIDELINE 3-1-3D: To reduce coverage error, whenever a frame has important 
deficiencies with respect to the measurement unit, dual-frame estimation should be 
considered to correct these deficiencies. Since dual-frame estimation can be 
expensive, the effect dual-frame estimation has on increasing the variance estimates 
should also be considered when deciding to use dual-frame estimation. 


STANDARD 3-1-4: For each sample survey, a description of the frame and its 
coverage must be included in the survey documentation. This description must include, 
but is not limited to, the target and frame populations (and exclusions thereof); the 
name and date of the data collection which provided the original frame; any 
supplementing done to the original frame; limitations of the frame including the 
timeliness of the frame; and, if applicable, an estimation of the missing units on the 
frame. 

GUIDELINE 3-1-4A: Sample survey documentation should include a discussion 
of coverage issues such as alternative frames that were considered, what was done 
to improve the coverage of the frame, and how data quality and item nonresponse 
on the frame may have affected the coverage of the frame. 

GUIDELINE 3-1-4B: Survey documentation should include any estimation 
techniques used to improve the coverage of estimates. This would include post- 
stratification procedures. (For example, a telephone survey could post-stratify 
estimates of all individuals to account for the exclusion of those without 
telephones.) 
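
A post-stratification adjustment of the kind described above can be sketched as a ratio adjustment of the weights within each post-stratum so that the weighted counts match known population totals. The control totals, post-strata, and function name below are hypothetical.

```python
from collections import defaultdict


def poststratify(records, control_totals):
    """Scale weights so each post-stratum sums to its control total.

    records: list of (post_stratum, weight) pairs, one per respondent.
    control_totals: known population total for each post-stratum.
    Returns the adjusted weights in the same order as records.
    """
    weighted_counts = defaultdict(float)
    for stratum, weight in records:
        weighted_counts[stratum] += weight
    return [
        weight * control_totals[stratum] / weighted_counts[stratum]
        for stratum, weight in records
    ]


# Hypothetical respondents in two post-strata.
adjusted = poststratify(
    [("a", 10.0), ("a", 10.0), ("b", 20.0)],
    {"a": 40.0, "b": 30.0},
)
```

In the telephone example, the adjustment can only compensate for excluded individuals to the extent that they resemble covered individuals within the same post-stratum, which is why the technique should be documented.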

GUIDELINE 3-1-4C: NCES survey staff should archive their survey’s sampling 
frames as part of the documentation of the survey system found in Standard 3-4, 
taking security precautions consistent with confidentiality laws into account. This 
archiving may be particularly important if a preliminary file was used to develop 
the frame, or if there is a chance that the frame may be used in the future to 
further develop research questions. 


STANDARD 3-1-5: NCES survey staff that use NCES data collections as a frame 
must share any coverage or usage issues with the NCES data collection staff so that the 
coverage can be improved for future uses. This standard is related to Guidelines 
3-1-3B and 3-1-3C. (For example, after the survey is complete, the survey staff should 
provide a memo to the NCES data collection staff for the data collection used as a 
frame, reviewing the major limitations of the coverage or the data quality issues 
identified.) 


SUBJECT: ACHIEVING ACCEPTABLE RESPONSE RATES 


NCES STANDARD: 3-2 

PURPOSE: To ensure that data collection programs conducted by or for the NCES are 
conducted in a manner that protects the rights of survey respondents to fair treatment 
and privacy, while at the same time encouraging high rates of response across all strata, 
since high response rates help ensure that results are representative of the target 
population. 

KEY TERMS: confidentiality, imputation, item nonresponse, longitudinal, 
nonresponse bias, pretest, required response items, response rate, strata, survey, and 
target population. 


STANDARD 3-2-1: The data collection must be designed and administered in a 
manner that protects the rights of the survey respondents, while encouraging 
respondents to participate. 

GUIDELINE 3-2-1A: The method of data collection (e.g., mail, telephone, 
Internet) should be appropriate for the target population and the objectives of the 
data collection. 

GUIDELINE 3-2-1B: The data should be collected at the most appropriate time of 
year. 

GUIDELINE 3-2-1C: The data collection period should be of adequate and 
reasonable length to achieve good response rates. 

GUIDELINE 3-2-1D: When appropriate, respondent incentives should be 
considered. 


STANDARD 3-2-2: An explanation of the need for data, the goals and objectives of 
the data collection, and examples of uses of the data that benefit respondents must be 
provided to the respondent (Privacy Act of 1974, as amended, 5 U.S.C. 552a). 

GUIDELINE 3-2-2A: The materials describing the data collection should be sent 
to respondents in advance, when possible. 

GUIDELINE 3-2-2B: For interviewer-administered data collection programs, 
training should emphasize techniques for obtaining respondent cooperation and 
techniques for building rapport with respondents, including respect for respondents’ 
rights, manner, follow-up skills, knowledge of the goals and objectives of the data 
collection, and knowledge of the uses of the data. 

GUIDELINE 3-2-2C: Prior to conducting a data collection program, 
endorsements, support, and the active cooperation of interested groups, such as 
professional organizations, professional associations, education community leaders, 
and state and local school district officials, should be obtained and communicated to 
respondents. 


STANDARD 3-2-3: All NCES data collections must provide information concerning 
the confidentiality of responses. Privacy and confidentiality assurances citing the 
appropriate legislation must be provided, as applicable (see Standard 4-2). 


STANDARD 3-2-4: In keeping with the goals of the particular data collection effort, 
respondent burden must be minimized, as required by the Office of Management and 
Budget clearance process. 

GUIDELINE 3-2-4A: The questionnaire should be pretested for the difficulty and 
interpretability of questions. 

GUIDELINE 3-2-4B: The questionnaire should be pretested for ease in navigation 
of self-administered questionnaires. 

GUIDELINE 3-2-4C: Questions should be clearly written and skip patterns easily 
followed. 

GUIDELINE 3-2-4D: The questionnaire should be of reasonable length. 


STANDARD 3-2-5: All data collection programs require some follow-up of 
nonrespondents to achieve desirable response rates. Follow-up strategies designed to 
protect the respondents’ rights, while achieving acceptable response rates, must be 
included in the data collection plan. 

GUIDELINE 3-2-5A: Internal reporting systems that provide timely reporting of 
response rates and the reasons for nonresponse throughout the data collection 
should be developed. These systems should be flexible enough to identify important 
subgroups with low response rates for more intensive follow-ups. 
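
An internal reporting system of this kind reduces, at its core, to tabulating response rates by subgroup and flagging the low ones. The sketch below is a minimal illustration; the subgroup labels, the 0.70 flagging threshold, and the function names are hypothetical and are not taken from these standards.

```python
from collections import defaultdict


def response_rates_by_group(cases):
    """cases: iterable of (group, responded) for eligible sampled cases."""
    eligible = defaultdict(int)
    responded = defaultdict(int)
    for group, did_respond in cases:
        eligible[group] += 1
        if did_respond:
            responded[group] += 1
    return {group: responded[group] / eligible[group] for group in eligible}


def groups_needing_follow_up(rates, threshold):
    """Subgroups whose response rate is below the chosen threshold."""
    return sorted(group for group, rate in rates.items() if rate < threshold)


# Hypothetical status file partway through data collection.
status = [
    ("urban", True), ("urban", True), ("urban", False), ("urban", True),
    ("rural", True), ("rural", False),
]
rates = response_rates_by_group(status)
flagged = groups_needing_follow_up(rates, threshold=0.70)
```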

GUIDELINE 3-2-5B: For longitudinal surveys, provide appropriate confidentiality 
assurances, while obtaining as much locating information about respondents as 
possible during initial contact (e.g., for a student, school address, home address, 
name of advisor, phone numbers of parents). 

GUIDELINE 3-2-5C: If response rates are low after the initial phases of data 
collection, and if further data collection on the full sample is deemed too costly, 
take a random subsample of nonrespondents and use a more intensive data 
collection method. This subsample will permit a description of nonrespondents' 
characteristics, provide data needed for nonresponse bias analysis, and allow for 
possible weight adjustments or for imputation of missing characteristics. 
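
Drawing the random subsample and carrying its weights forward can be sketched as follows. The sketch assumes an equal-probability subsample, so each selected case's weight is multiplied by the inverse of the subsampling rate; the function name, seed, and case data are hypothetical.

```python
import random


def subsample_nonrespondents(nonrespondents, subsample_size, seed):
    """Select a simple random subsample and inflate its weights.

    nonrespondents: list of (case_id, weight) pairs.
    Returns (case_id, adjusted_weight) pairs for the selected cases.
    """
    rng = random.Random(seed)             # fixed seed keeps the draw reproducible
    chosen = rng.sample(nonrespondents, subsample_size)
    factor = len(nonrespondents) / subsample_size
    return [(case_id, weight * factor) for case_id, weight in chosen]


# Hypothetical pool of 100 nonrespondents, each with a weight of 10.
pool = [(case_id, 10.0) for case_id in range(100)]
followup = subsample_nonrespondents(pool, subsample_size=25, seed=2002)
```

Because the adjusted weights of the subsample add up to the total weight of all nonrespondents, the subsample can stand in for the full nonrespondent group in a nonresponse bias analysis.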

GUIDELINE 3-2-5D: Determine a set of required response items to obtain when a 
respondent is unwilling to fully cooperate. These items may then be targeted in 
follow-up to meet the minimum standard for unit response. These items may also 
be used in a nonresponse bias analysis that compares characteristics of respondents 
and nonrespondents using the sample data for those items. These required response 
items may also be used for item nonresponse imputation systems. 


SUBJECT: MONITORING AND DOCUMENTING SURVEY CONTRACTS 


NCES STANDARD: 3-3 

PURPOSE: To assist NCES staff in monitoring and documenting survey contract 
activities. 

KEY TERMS: edit, estimation, imputation, response rate, stage of data collection, 
survey, survey system, and variance. 


STANDARD 3-3-1: The Contracting Officer’s Representative (COR) must work to 
ensure that the contractor meets (a) contract specifications, (b) contract schedules, (c) 
NCES standards, (d) performance cost controls, and (e) beneficial effort/method of 
performance criteria in fulfilling the contract. Education Department Directive C: 
GPA:2-105 dated 6/15/92 established the Standards and Guidelines for the Monitoring 
of Contracts. 

In some instances, the contractor may request technical redirection for unanticipated 
problems. For simple matters that are clearly within the scope of the contract, such 
requests may be made verbally. For problems that may require a change in scope, all 
requests must be in writing and outline the issue(s) and potential options. The COR 
must use this information in discussions with other NCES senior management in 
determining the appropriate course of action. All changes in any contract scope of 
work require action by the Contracting Officer. Whatever course of action is taken, it 
must be documented and placed in the project files. 

GUIDELINE 3-3-1A: The COR should maintain close communication with the 
contractor. Depending on the nature of the survey, the COR should maintain 
communication through the use of meetings, phone calls, e-mails, visits, and/or the 
electronic management information system (MIS) for the purpose of tracking and 
monitoring the progress of the survey. 

GUIDELINE 3-3-1B: The COR should review and verify progress reports, 
vouchers, technical products and documentation, written correspondence, and other 
documents for the following purposes: 

1. Monitoring adherence to project schedules and requirements; 

2. Assuring deliverables meet NCES standards and comply with the conditions of 
the contract and other quality requirements (e.g., accuracy and completeness); 
and 

3. Identifying potential problems that would substantially affect the successful 
completion of the survey or alter the terms and conditions of the contract (e.g., 
cost or time increases, quality decreases). 

GUIDELINE 3-3-1C: The status of each unit of observation should be kept 
current and available to the COR at each stage of the data collection process. 
Critical status events may include, but are not limited to, dates of questionnaire mail 
out, returns, deletions (out-of-scopes), scan editing, data entry, machine editing, 
callback(s), and addition to the final data files. The COR should request direct and 
rapid access to the information. 

GUIDELINE 3-3-1D: To help decide whether any adjustments or corrective 
actions are needed, soon after initial startup of field operations, and less frequently 
thereafter, the COR should evaluate the quality of survey operations by comparing a 
sample of the original returned questionnaires with the information on the data file 
for the following purposes: 

1. Detect any data processing errors; 

2. Learn of any problems with reporting or questionnaire design; and 

3. Ensure that editing/update procedures are being correctly implemented. 

GUIDELINE 3-3-1E: On an as-needed basis, CORs may request a copy of 
“completed” records from the current master file (sometimes referred to as a “pull”) 
and analyze the information for conformance to contract requirements. The extent 
of the statistical analysis of a pulled database should vary with survey objectives. 
Simple cross-tabulations and frequencies of discrete variables should normally 
point out internal coding inconsistencies and also provide interim item response 
rates. Simple descriptive statistics for continuous variables should provide interim 
item response rates, measures of dispersion, and outliers. 
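
The frequencies and interim item response rates described above can be produced from a pull with very little code. The sketch below assumes missing values are stored as None and, for simplicity, ignores skip patterns, so every record is treated as eligible for every item; the item names and records are hypothetical.

```python
from collections import Counter

MISSING = None  # assumed missing-value code in the pulled file


def item_response_rate(records, item):
    """Share of records with a non-missing value for the item."""
    answered = sum(1 for record in records if record.get(item) is not MISSING)
    return answered / len(records)


def frequencies(records, item):
    """Frequency table for a discrete item, including missing values."""
    return Counter(record.get(item) for record in records)


# Hypothetical "completed" records from a pull.
pull = [
    {"grade": 4, "sex": "F"},
    {"grade": 4, "sex": None},
    {"grade": 8, "sex": "M"},
    {"grade": None, "sex": "M"},
]
grade_rate = item_response_rate(pull, "grade")
grade_counts = frequencies(pull, "grade")
```

In practice the denominator for each item would be limited to the cases that were routed to that item, and unexpected codes in the frequency table would point to coding inconsistencies.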

GUIDELINE 3-3-1F: The COR should ensure that software used for weighting, 
imputations, and variance estimation is accurate. This may be done through a series 
of "checkpoints" imbedded within the program(s). Another alternative is to have the 
contractor provide printouts from a series of discrete steps with review by the COR 
along the way. 
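
As an illustration of what embedded checkpoints might look like, the sketch below places two assertions inside a simple weighting-class nonresponse adjustment. The adjustment method, the cell structure, and the tolerance are assumptions made for the example; they are not requirements of this guideline.

```python
from collections import defaultdict


def nonresponse_adjust(cases):
    """Weighting-class nonresponse adjustment with embedded checkpoints.

    cases: list of (cell, base_weight, responded) for eligible sampled units.
    Returns (cell, adjusted_weight) pairs for respondents.
    """
    eligible_total = defaultdict(float)
    respondent_total = defaultdict(float)
    for cell, weight, responded in cases:
        # Checkpoint 1: base weights must be positive.
        assert weight > 0, "non-positive base weight"
        eligible_total[cell] += weight
        if responded:
            respondent_total[cell] += weight

    adjusted = [
        (cell, weight * eligible_total[cell] / respondent_total[cell])
        for cell, weight, responded in cases
        if responded
    ]

    # Checkpoint 2: adjusted respondent weights must reproduce the
    # eligible weighted total in every cell (fails if a cell has no respondents).
    check = defaultdict(float)
    for cell, weight in adjusted:
        check[cell] += weight
    for cell, total in eligible_total.items():
        assert abs(check[cell] - total) <= 1e-6 * total, f"totals differ in {cell}"
    return adjusted


# Hypothetical cases in two adjustment cells.
weights = nonresponse_adjust([("a", 10.0, True), ("a", 10.0, False), ("b", 5.0, True)])
```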

GUIDELINE 3-3-1G: The COR should keep the CO and NCES management 
informed of the result of reviews. As an integral part of this work, the COR should 
offer recommendations for solving any problems, acceptance of deliverables, 
performance awards, and approval or disapproval of any proposed changes. 


STANDARD 3-3-2: The COR must maintain the following documents in the COR 
contract file: (a) progress reports, (b) vouchers, and (c) deliverables as required by the 
contract. Together with the RFP, contract proposal, proposal evaluation, and signed 
contract, these documents are subject to audit. Also document any modifications or 
changes in (a) key personnel, (b) project schedule, (c) deliverables, and (d) scope of 
work, and their implications for the project completion date, deliverables, and costs. 



GUIDELINE 3-3-2A: It is advisable to include in the contract file all 
correspondence, such as logs of phone conversations, e-mail and written 
correspondence, and documentation, describing the approval of or decisions made 
regarding changes. 

GUIDELINE 3-3-2B: The COR should keep accurate and complete records of 
contractor performance, such as lateness, unacceptable deliverables, and cost 
overrun. Actions or decisions taken by the COR or CO to remedy the problems 
should also be clearly documented. 


STANDARD 3-3-3: CORs should require that all computer programs 
(software) be self-documenting. 

GUIDELINE 3-3-3A: The programmer should insert "comments" within 
the program(s) to describe each discrete section of code. Relationships 
between programs and data files should be flowcharted or described in a 
separate document. This includes record layouts and file structures. 


STANDARD 3-3-4: Upon completion and/or termination of the contract, the COR 
must archive those items specified in the Standard for Documenting a Survey System 
(3-4) and Standard for Survey Documentation in Reports (7-2). 


SUBJECT: DOCUMENTING A SURVEY SYSTEM 


NCES STANDARD: 3-4 

PURPOSE: To ensure that complete documentation is kept on NCES surveys and 
survey systems and their associated contract deliverables. Documentation includes 
those materials necessary to understand how to properly analyze data from each survey, 
as well as the information necessary to replicate and evaluate each survey. In addition, 
survey system documentation includes information necessary to design and estimate 
resource requirements of future similar surveys. 

KEY TERMS: coverage, edit, frame, imputation, instrument, nonsampling error, 
public-use data file, response rate, sampling error, strata, survey, survey system, and 
variance. 


STANDARD 3-4-1: Survey system documentation must include all information 
necessary to properly analyze the data. This information shall, at a minimum, include 
the following: 

1. Final data set(s); 

2. Final instrument(s) or a facsimile thereof; 

3. Definitions of all variables; 

4. Data file layout; 

5. Descriptions of constructed variables on the data file that are computed from 
responses to other variables on the file; 

6. Description of variables used to uniquely identify cases in the data file; 

7. Description of sample weights and how to apply them; 

8. Description of the strata and primary sampling unit (PSU) identifiers to be used for 
analysis; 

9. Description of how to calculate variances appropriate for the survey design; 

10. Description of all imputation methods applied to the data and how to remove 
imputed values from the data; and 

11. Descriptions of known data anomalies and corrective actions. 

GUIDELINE 3-4-1A: If the data are collected through a web-based collection or 
through a CATI or CAPI interview, the following information should be included in 
the documentation of the final instruments: 

1. All items in the instrument (e.g., questions, check items, and help screens); 

2. Items extracted from other data files to pre-fill the instrument (e.g., dependent 
data from a prior round of interviewing); and 

3. Items that are input to the post data collection processing steps (e.g., output of 
an automated instrument). 


STANDARD 3-4-2: To ensure that a survey can be replicated and properly evaluated, 
the survey system documentation must, at a minimum, include the following: 

1. Justifications for the items on the survey instrument, including how the final items 
were selected; 

2. All instructions to respondents and/or interviewers either about how to properly 
respond to a survey item or how to properly present a survey item; 

3. Description of the data collection methodology; 

4. Sampling plan and justifications for why it was implemented, and, if possible, the 
final sample frame; 

5. Selected sample; 

6. Description of the magnitude of sampling error associated with the survey, and how 
it was calculated; 

7. Description of the sources of nonsampling error associated with the survey (e.g., 
coverage, measurement); 

8. Unit response rates (weighted and unweighted); 

9. Overall response rates (weighted and unweighted); 

10. Item response rates; and 

11. Total response rates. 

GUIDELINE 3-4-2A: The survey system documentation should also include the 
following: 

1. Final weighting plan specifications, including calculations for how the final 
weights were derived, and justifications for why it was implemented; 

2. Final imputation plan specifications and justifications for why it was 
implemented; 

3. Data editing plan specifications and justifications for why it was implemented; 
and 

4. Data processing plan specifications and justifications for why it was 
implemented. 

GUIDELINE 3-4-2B: Where appropriate, methods for bounding or estimating the 
nonsampling error from each source identified in the evaluation plan should be 
developed and implemented. 

GUIDELINE 3-4-2C: Where possible, nonsampling error estimates and bounds 
should make use of data from other surveys or from administrative records or 
censuses, taking into account the limitations of the external data. 

GUIDELINE 3-4-2D: For recurring surveys, a quality profile report that itemizes 
all sources of identified error should be produced. Where possible, estimates or 
bounds on the magnitudes of these errors should be provided; the total error model 
for the survey should be discussed; and the survey should be assessed in terms of 
this model. 


STANDARD 3-4-3: To ensure that NCES has sufficient information to design future 
surveys and to accurately estimate their resource requirements, survey system 
documentation must include the following: 

1. All information germane to the contractual operation of the survey, including the 
request for proposals used to solicit the contract(s); 

2. Independent government cost estimate; 

3. Contract(s) used to develop, conduct, and report on the survey; 

4. Any modifications to the contract(s); 

5. Final contract deliverables, progress reports, and vouchers; and 

6. Office of Management and Budget (OMB) clearance package and correspondence 
with OMB about survey clearance. 


STANDARD 3-4-4: At a minimum, survey documentation must be stored 
electronically in a format that can be viewed without proprietary software. Final data 
sets shall be stored in ASCII format. Additional copies in other formats are allowed, 
but ASCII versions are required. In addition, substantive reports written to release the 
data shall also be stored, at a minimum, in the format originally used to produce the 
report, and PDF or ASCII (see Standard 7-1). 


STANDARD 3-4-5: All reports, documentation, and public-use data files must be 
stored on the web, a CD-ROM, or an NCES dedicated server. Restricted data files and 
associated documentation must be transmitted to the Statistical Standards Program for 
secure storage. 

PROCESSING AND EDITING OF DATA 


4-1 Data Editing and Imputation of Item Nonresponse 
4-2 Maintaining Confidentiality 
4-3 Evaluation of Surveys 
4-4 Nonresponse Bias Analysis 

SUBJECT: DATA EDITING AND IMPUTATION OF ITEM NONRESPONSE 


NCES STANDARD: 4-1 

PURPOSE: To establish guidelines to reduce potential bias, ensure consistent 
estimates, and simplify analysis, by substituting values for missing (i.e., imputation) or 
inconsistent data in a data set (i.e., edits). 

KEY TERMS: cross-sectional, cross-sectional imputations, cross-wave imputations, 
edit, freshened sample, imputation, item nonresponse, key variables, longitudinal, 
nonresponse bias, response rate, stage of data collection, and universe. 


STANDARD 4-1-1: Prior to imputation the data must be edited. Data editing is an 
iterative and interactive process that includes procedures for detecting and correcting 
errors in the data. Data editing must be repeated after the data are imputed, and again 
after the data are altered during disclosure risk analysis. At each stage, the data must be 
checked for 

1. Credibility based on range checks to determine if all responses fall within a 
prespecified reasonable range; 

2. Consistency based on checks across variables within individual records for 
noncontradictory responses and for correct flow through prescribed skip patterns; 
and 

3. Completeness based on the amount of nonresponse, including efforts to fill in 
missing data directly from other portions of an individual’s record. 
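
The three checks in Standard 4-1-1 can be sketched in code. The following is a minimal illustration only. The variable names (age, enrolled, grade), the range, and the skip pattern are all hypothetical and do not come from any NCES instrument.

```python
# Hypothetical record layout: None marks a missing response.
RANGES = {"age": (3, 21)}  # assumed reasonable range, for illustration only


def range_errors(record):
    """Credibility: list variables whose value falls outside its prespecified range."""
    errors = []
    for var, (low, high) in RANGES.items():
        value = record.get(var)
        if value is not None and not (low <= value <= high):
            errors.append(var)
    return errors


def skip_pattern_ok(record):
    """Consistency: in this assumed skip pattern, grade is asked only if enrolled is 'yes'."""
    if record.get("enrolled") == "no":
        return record.get("grade") is None
    return True


def item_nonresponse(records, var):
    """Completeness: unweighted share of records missing a value for var."""
    missing = sum(1 for r in records if r.get(var) is None)
    return missing / len(records)


records = [
    {"age": 10, "enrolled": "yes", "grade": 5},
    {"age": 45, "enrolled": "yes", "grade": 4},    # fails the range check
    {"age": 12, "enrolled": "no", "grade": 7},     # violates the skip pattern
    {"age": None, "enrolled": "yes", "grade": 6},  # missing age
]

flagged_range = [i for i, r in enumerate(records) if range_errors(r)]
flagged_skip = [i for i, r in enumerate(records) if not skip_pattern_ok(r)]
missing_age = item_nonresponse(records, "age")
```

Because the standard requires editing to be repeated after imputation and again after disclosure risk analysis, checks of this kind would be rerun on the file at each of those stages.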


STANDARD 4-1-2: Key variables in data sets used for cross-sectional estimates must 
be imputed (beyond overall mean imputation). This applies to cross-sectional data sets 
and to data from longitudinal data sets that are used to produce cross-sectional 
estimates (i.e., base year and subsequent freshened samples). (See appendix B for a 
discussion of alternative imputation procedures, including the pros and cons of specific 
approaches). 

GUIDELINE 4-1-2A: In census (universe) data collections, it may not be 
appropriate to impute data in certain situations (e.g., peer analysis situations or 
when data for a particular establishment — school, university, or library — are being 
examined individually). 

GUIDELINE 4-1-2B: When using non-NCES data sets, it is desirable to impute 
for missing data in those items being used in NCES publications. This is only 
appropriate when adequate auxiliary information is available. 

GUIDELINE 4-1-2C: Imputation procedures should be internally consistent, be 
based on theoretical and empirical considerations, be appropriate for the analysis, 
and make use of the most relevant data available. If multivariate analysis is 
anticipated, care must be taken to use imputations that minimize the attenuation of 
underlying relationships. The Chief Statistician should review imputation plans 
prior to implementation. 


STANDARD 4-1-3: In the case of longitudinal data sets, two imputation approaches 
are acceptable: cross-wave imputations or cross-sectional imputations. Cross-wave 
imputations may be used to complete missing data for longitudinal analysis or cross- 
sectional imputations may be used. (Guideline 4-1-2C of this Standard applies here, as 
well.) 


STANDARD 4-1-4: In those cases where a nonresponse bias analysis shows that the 
data are not missing at random, the amount of potential bias must inform the decision to 
retain or delete individual items (see Standard 4-4). 


STANDARD 4-1-5: In cases where imputation is not used (e.g., items that are not key 
variables in either cross-sectional or longitudinal analysis), data tables must include a 
reference to a methodology table or glossary that shows the actual weighted response 
rates for each unimputed variable included in the report (see Standard 1-3 for the item 
response rate formula). For individual variables with item response rates less than 85 
percent, the variable must be footnoted in the row or column header. The footnote must 
alert readers to the fact that the response rate is below 85 percent and that missing data 
have not been explicitly accounted for in the data. 
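
As an illustration of the 85 percent rule, the sketch below computes a weighted item response rate and tests it against the threshold. Standard 1-3 gives the governing formula. The simplified ratio used here (weighted item respondents over weighted eligible respondents) is an assumption made for illustration, and the weights are hypothetical.

```python
FOOTNOTE_THRESHOLD = 0.85  # threshold stated in Standard 4-1-5


def weighted_item_response_rate(cases):
    """cases: list of (weight, answered) pairs for respondents eligible for the item.

    Assumed simplified form: weighted answered / weighted eligible.
    """
    eligible = sum(w for w, _ in cases)
    answered = sum(w for w, a in cases if a)
    return answered / eligible


def needs_footnote(rate):
    """True when the item response rate is below 85 percent."""
    return rate < FOOTNOTE_THRESHOLD


# Hypothetical weights: 450 of 500 weighted eligible cases answered the item.
cases = [(100.0, True), (150.0, True), (50.0, False), (200.0, True)]
rate = weighted_item_response_rate(cases)
```

With these hypothetical weights the rate is 90 percent, so the variable would be listed in the methodology table but would not need a footnote in the row or column header.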


STANDARD 4-1-6: When imputations are used, documentation indicating the 
weighted proportion of imputed data must be presented for all published estimates 
based on NCES data. Information about the amount of imputed data in the analysis can 
be included in the technical notes and does not have to accompany each table. The 
range of the amount of imputation used for the set of items included in an analysis must 
be reported. Also, the amount of imputation must be reported for items with response 
rates less than 70 percent. Items with response rates lower than 70 percent must be 
footnoted in the tables. 


STANDARD 4-1-7: All imputed values on a data file must be clearly identified as 
such. 

GUIDELINE 4-1-7A: Imputed data should be flagged in associated “flag” fields. 
The imputation method should be identified in the flag. Blanks are not legitimate 
values for flags. 
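
A flag field of this kind might be checked as in the sketch below. The flag codes, the method labels, and the field names (salary, salary_flag) are hypothetical; each survey defines its own codes.

```python
# Hypothetical flag codes; the actual codes are defined per survey.
FLAG_CODES = {
    0: "reported, not imputed",
    1: "imputed: logical (from other items on the record)",
    2: "imputed: hot deck",
}


def invalid_flags(records, flag_field):
    """Return indexes of records whose flag is blank or not a defined code."""
    bad = []
    for i, record in enumerate(records):
        if record.get(flag_field) not in FLAG_CODES:
            bad.append(i)
    return bad


records = [
    {"salary": 52000, "salary_flag": 0},
    {"salary": 48000, "salary_flag": 2},
    {"salary": 50000, "salary_flag": None},  # blank flag: not a legitimate value
]
bad = invalid_flags(records, "salary_flag")
```

Giving reported values their own explicit code (0 here) is what lets a blank be treated as an error rather than as "not imputed".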


STANDARD 4-1-8: If nonimputed items are used in the estimation of totals or ratios 
(as in Standard 4-1-3 above), the risks of not using imputed data must be described. 

1. Estimated totals using nonimputed data implicitly impute a zero value for all 
missing data. These zero implicit imputations will mean that the estimates of totals 
will underestimate the true population totals. Thus, when reporting totals based on 
a nonimputed item, the response rate for that item must be footnoted in the data 
table. 

2. Ratios (averages) using nonimputed data will implicitly impute the cell ratio for all 
missing data within the cell. This can cause inconsistencies in the estimates 
between tables. 
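
The two implicit imputations can be seen in a small worked example. The weights and values below are hypothetical, and filling the missing case with the respondent mean is used only to expose what the nonimputed estimates assume. It is not a recommended imputation method.

```python
# Hypothetical weighted cases: (weight, reported value or None if missing)
cases = [(10.0, 4.0), (10.0, 6.0), (10.0, None), (10.0, 5.0)]

# Total from nonimputed data: the missing case contributes nothing (an implicit zero).
reported_total = sum(w * v for w, v in cases if v is not None)

# Mean from nonimputed data: computed over respondents only.
reported_weight = sum(w for w, v in cases if v is not None)
reported_mean = reported_total / reported_weight

# Filling the missing case with the respondent mean leaves the mean unchanged
# but raises the total, showing what each estimate implicitly assumed.
filled_total = sum(w * (v if v is not None else reported_mean) for w, v in cases)
filled_mean = filled_total / sum(w for w, _ in cases)
```

Here the nonimputed total is 150 while the filled total is 200, and the mean is 5 in both cases. The total behaves as if the missing value were zero, whereas the mean behaves as if the missing value equaled the cell mean.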



SUBJECT: MAINTAINING CONFIDENTIALITY 


NCES STANDARD: 4-2 

PURPOSE: To protect the confidentiality of NCES data that contain information about 
individuals (individually identifiable information). For this reason, staff must be 
cognizant of the requirements of the law and must monitor the confidentiality of 
individually identifiable information in their daily activities and in the release of 
information to the public. 

KEY TERMS: coarsening, confidentiality, confidentiality edits, Data Analysis System 
(DAS), data swapping, edits, disclosure risk analysis, individually identifiable data, 
perturbation techniques, public-use data file, public-use edits, restricted-use data file, 
stage of data collection, and statistical disclosure techniques. 

LEGAL REQUIREMENTS: Four laws cover protection of the confidentiality of 
individually identifiable information collected by NCES — the Privacy Act of 1974, as 
amended; the E-Government Act of 2002; the Education Sciences Reform Act of 2002; 
and the USA Patriot Act of 2001. 

Privacy Act of 1974, as amended — “The purpose of this Act is to provide certain 
safeguards for an individual against invasion of personal privacy by requiring Federal 
agencies... to collect, maintain, use or disseminate any record of identifiable personal 
information in a manner that assures that such action is for necessary and lawful 
purpose, that the information is current and accurate for its intended use, and that 
adequate safeguards are provided to prevent misuse of such information.” A willful 
disclosure of individually identifiable data is a misdemeanor, subject to a fine up to 
$5,000. 

E-Government Act of 2002, Title V, Subtitle A, Confidential Information Protection 
(CIP 2002) — Under this law, all individually identifiable information supplied by 
individuals or institutions to a federal agency for statistical purposes under the pledge 
of confidentiality must be kept confidential and may only be used for statistical 
purposes. Any willful disclosure of such information for nonstatistical purposes, 
without the informed consent of the respondent, is a Class E felony. 

Education Sciences Reform Act of 2002 (ESRA 2002) — Under this law, all individually 
identifiable information about students, their families, and their schools shall remain 
confidential. To this end, this law requires that no person may 

a. Use any individually identifiable information furnished under the provisions of this 
section for any purpose other than statistical purposes for which it is supplied, 
except in the case of terrorism (see discussion of the Patriot Act); 

b. Make any publication whereby the data furnished by any particular person under 
this section can be identified; or 



c. Permit anyone other than the individuals authorized by the Commissioner to 
examine the individual reports. 

Further, individually identifiable information is immune from legal process, and shall 
not, without the consent of the individual concerned, be admitted as evidence or used 
for any purpose in any action, suit, or other judicial or administrative proceeding, 
except in the case of terrorism. Employees, including temporary employees, or other 
persons who have sworn to observe the limitations imposed by this law, who 
knowingly publish or communicate any individually identifiable information will be 
subject to fines of up to $250,000, or up to 5 years in prison, or both (Class E felony). 

USA Patriot Act of 2001 — This law permits the Attorney General to petition a court of 
competent jurisdiction for an ex parte order requiring the Secretary of the Department 
of Education to provide data relevant to an authorized investigation or prosecution of an 
offense concerning national or international terrorism. The law states that any data 
obtained by the Attorney General for these purposes “...may be used consistent with 
such guidelines as the Attorney General, after consultation with the Secretary, shall 
issue to protect confidentiality.” This law was incorporated into ESRA 2002. 

Federal Statistical Confidentiality Order of 1997 — This OMB Order provides a 
consistent government policy for “...protecting the privacy and confidentiality interests 
of persons who provide information for Federal statistical programs...” The Order 
defines relevant terms and provides guidance on the content of confidentiality pledges 
that Federal statistical programs should use under different conditions. The Order 
provides language for confidentiality pledges under two conditions — first, when the 
data may only be used for statistical purposes; second, when the data are collected 
exclusively for statistical purposes, but the agency is compelled by law to disclose the 
data. Since the USA Patriot Act of 2001 includes a legal requirement that compels 
NCES to share the data under the conditions specified in the law (see above), the 
second condition applies to NCES. In this case, the Order instructs the agency to “...at 
the time of collection, inform the respondents from whom the information is collected 
that such information may be used only for statistical purposes and may not be 
disclosed, or used, in identifiable form for any other purpose, unless otherwise 
compelled by law.” 


STANDARD 4-2-1: All NCES staff, without exception, must pledge not to release any 
individually identifiable data, for any purpose, to any person not sworn to the 
preservation of confidentiality. Individually identifiable data are confidential and 
individually identifiable data are protected from legal process unless the individual 
provides written consent, except in the case of the authorized investigation and 
prosecution of terrorism. 


STANDARD 4-2-2: All contractors whose activities might involve contact with 
individually identifiable information must provide NCES Project Officers with a list of 
all staff who might have contact with such data; all such staff must have a signed 
notarized affidavit of nondisclosure on file at NCES. These affidavits and the staff list 
must be kept current as staff members leave and as new staff members are assigned to 
NCES projects with individually identifiable information. 


STANDARD 4-2-3: All contractor staff with access to individually identifiable 
information must only use that information for purposes associated with the data 
collection and analysis specified in the contract. 


STANDARD 4-2-4: Respondents must be told in a cover letter or in instructions that 
“Your answers may be used only for statistical purposes and may not be disclosed, or 
used, in identifiable form for any other purpose except as required by law.” 
Furthermore, the routine statistical purposes for which the data may be used must be 
explained. 


STANDARD 4-2-5: All materials having individually identifiable data must be kept 
secure at all times through the use of passwords, physical separation of individual 
identity from the rest of the data, and secure data handling and storage. (See the 
Restricted-Use Data Procedures Manual, 2000.) 


STANDARD 4-2-6: When confidentiality edits (that are performed using perturbation 
techniques) are used for a data file, they must be applied to all analytical files (e.g., 
public-use files, DAS files, and restricted-use files) derived from that data file. 


STANDARD 4-2-7: NCES distributes Data Analysis Systems (DAS) that produce 
tabular estimates from restricted-use files. In this case, the following conditions must be 
met: 

1. NCES may not release the exact sample size for restricted-use data files that are 
distributed through a DAS. 

2. Only restricted-use data files with Disclosure Review Board (DRB) approved 
confidentiality edits may be used to produce a DAS. 

3. A DAS may not publish unweighted counts. 

The confidentiality protection required in a DAS is a function of the type of estimate(s) 
to be produced. For example, a DAS that produces cell counts may require the use of 
more extensive confidentiality edits. 

If a public-use file is released or planned for a data file, any DAS created for that data 
file must be based on public-use data or restricted-use data that have undergone 
perturbation disclosure limitation techniques as part of confidentiality edits. 


STANDARD 4-2-8: For public-use data files, NCES minimizes the possibility of a 
user matching outliers or unique cases on the file with external (or auxiliary) data 
sources. Because public-use files allow direct access to individual records, perturbation 
and coarsening disclosure limitation techniques may both be required. The perturbation 
disclosure limitation techniques, by definition, include the techniques applied in a 
confidentiality edit (if one is performed) and may include additional perturbation 
disclosure limitation techniques as well. 


Methods for protecting individually identifiable data 

                                           Methods 
Type of protection                  Perturbation    Coarsening 
Confidentiality edit                Yes             Yes 
Disclosure limitation techniques    Yes             Yes 


All public-use files (i.e., the edited restricted-use files) that contain any potentially 
individually identifiable information must undergo a disclosure risk analysis in 
preparation for release to the public. The steps are as follows: 

1. At an early stage in designing and conducting this analysis, staff must consult the 
Disclosure Review Board (DRB) for guidance on disclosure risk analysis and on the 
use of NCES disclosure risk software. Any modifications that are necessary as a 
result of the analysis must be made, and the entire process must be documented. 

2. The documentation of the disclosure risk analysis must be submitted to the DRB. 
The documentation must include descriptions of the risk of disclosure and the types 
of edits used to avoid disclosure. Decisions over the type of confidentiality edits 
must take into account the procedures needed to avoid disclosure of individually 
identifiable information, age of the data, accessibility of external files, detail and 
specificity of the data, and reliability and completeness of any external files. The 
documentation should also include the results demonstrating the disclosure risk 
after adjustments to the data. 

3. The DRB will review the disclosure risk analysis report and make a 
recommendation to the Commissioner of NCES about the file release. 

4. The Commissioner then rules on the release of the data file. 


STANDARD 4-2-9: Inasmuch as confidentiality edits are intended to protect 
individually identifiable data, files that incorporate the results of the DRB approved 
confidentiality edit plan may be used to produce tables without confidentiality concerns 
over minimum cell sizes. When this is done: 

1. All versions of a data file must reflect the same confidentiality edits. Staff must 
consult the DRB on the confidentiality plan, data file dissemination plan (restricted, 
public use, and/or DAS), and disclosure risk analysis plan, concurrently. 

2. Documentation of the confidentiality edit must be included along with the 
documentation of the disclosure risk analysis that is submitted to the DRB. 


STANDARD 4-2-10: A survey program may decide not to apply confidentiality edits 
(i.e., perturbation disclosure limitation techniques) to a restricted-use file (and the 
associated public-use file). In this situation, when tabulations are produced, any table 
with a cell with 1 or 2 unweighted cases must be recategorized to ensure that each cell 
in the table has at least 3 unweighted cases. This restriction also applies to 
documentation for public-use files. This rule excludes table cells with zero cases 
because there are no data to protect in the cell. 

EXAMPLE: A principal salary table by race and years of experience may only 
have 2 Asian respondents with more than 20 years of experience. To implement 
this standard, one could either combine the Asian category 
with another race group or combine the 20+ years of experience category with 
the next lower experience category. This process would continue until all cells 
have either at least 3 unweighted cases or no unweighted cases. 
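
The recategorization described in this example can be sketched as follows. The counts, category labels, and choice of which rows to combine are hypothetical. The helper names are illustrative and are not part of any NCES software.

```python
MIN_CELL = 3  # minimum unweighted cases per nonzero cell, from Standard 4-2-10


def cells_to_recategorize(table):
    """table maps (row, column) labels to unweighted counts.

    Returns the cells with 1 or 2 unweighted cases; zero cells are allowed.
    """
    return sorted(cell for cell, n in table.items() if 0 < n < MIN_CELL)


def collapse_rows(table, rows, new_label):
    """Combine the given row categories into one row called new_label."""
    combined = {}
    for (row, col), n in table.items():
        key = (new_label if row in rows else row, col)
        combined[key] = combined.get(key, 0) + n
    return combined


# Hypothetical unweighted counts by race group and years of experience.
table = {
    ("Asian", "0-20"): 5, ("Asian", "20+"): 2,
    ("White", "0-20"): 40, ("White", "20+"): 30,
    ("Other", "0-20"): 0, ("Other", "20+"): 4,
}
before = cells_to_recategorize(table)
collapsed = collapse_rows(table, {"Asian", "Other"}, "Asian or other")
after = cells_to_recategorize(collapsed)
```

In this sketch only the cell with 2 cases is flagged; the zero cell is not. After the two rows are combined, no cell has fewer than 3 unweighted cases. In practice the check would be repeated after each recategorization until no cells are flagged.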


STANDARD 4-2-11: At the discretion of the Commissioner of NCES, data security 
staff may release individually identifiable data to persons for statistical uses compatible 
with the purposes for which the data were collected. Persons receiving individually 
identifiable data from NCES shall execute a restricted-use data license agreement, sign 
affidavits of nondisclosure, and meet such other requirements as deemed necessary in 
accordance with other confidentiality provisions of the law. 


STANDARD 4-2-12: Before external data users may gain access to public-use data 
files, they must agree that they will not use the data to attempt to identify any individual 
whose data are in the file. This may be accomplished by using the following wording: 

“WARNING” 

Under law, public use data collected and distributed by the National 
Center for Education Statistics (NCES) may be used only for statistical 
purposes. 

Any effort to determine the identity of any reported case by public-use 
data users is prohibited by law. Violations are subject to Class E felony 
charges of a fine up to $250,000 and/or a prison term up to 5 years. 

NCES does all it can to assure that the identity of data subjects cannot be 
disclosed. All direct identifiers, as well as any characteristics that might 
lead to identification, are omitted or modified in the dataset to protect 
the true characteristics of individuals. Any intentional identification or 
disclosure of a person violates the assurances of confidentiality given to 
the providers of the information. Therefore, users shall: 


• Use the data in this dataset for statistical purposes only. 

• Make no use of the identity of any person discovered 
inadvertently, and advise NCES of any such discovery. 

• Not link this dataset with individually identifiable data from 
other NCES or non-NCES datasets. 

To proceed you must signify your agreement to comply with the 
above-stated statutorily based requirements. 


REFERENCE 


U.S. Department of Education, Office of Educational Research and Improvement, 
National Center for Education Statistics. (2000). Restricted-Use Data Procedures 
Manual. Washington, DC: U.S. Government Printing Office. 

SUBJECT: EVALUATION OF SURVEYS 


NCES STANDARD: 4-3 

PURPOSE: To provide the necessary information for users of the survey data to 
understand the quality and limitations of the data and to provide information for 
planning future surveys or replications of the same survey. The evaluation should also 
include a systematic assessment of all sources of error for key statistics that will be 
studied or reported in NCES publications. 

KEY TERMS: coverage error, edit, estimation, field test, frame, imputation, item 
nonresponse, key variables, longitudinal, nonsampling error, overcoverage, pretest, 
response rate, sampling error, stage of data collection, survey, survey system, 
undercoverage, unit nonresponse, and variance. 


STANDARD 4-3-1: All proposed and ongoing surveys conducted by NCES must 
include an evaluation component in the survey design plan. The survey evaluation must 
include the following: 

1. Range of potential sources of error; 

2. Measurement of the magnitude of sampling error and sources of the various types 
of nonsampling error expected to be a problem; 

3. Studies that identify factors associated with differential levels of error and assess 
procedures for reducing the magnitude of these errors; 

4. Assessment of the quality of the final estimates, including comparisons to external 
sources, and where possible, comparisons to prior estimates from the same data 
collection; and 

5. Technical report or series of technical reports summarizing results of evaluation 
studies; for example, a quality profile or total survey error model. 

GUIDELINE 4-3-1A: Review past surveys similar to the one being planned to 
determine what statistical evaluation data have been collected in prior surveys and 
any potential problems that have been identified. Based on this review, prepare a 
written summary of what is known about the sources and magnitude of error. 

GUIDELINE 4-3-1B: Indicate how each issue will be addressed, including the 
identification of required data internal and external to the study, a discussion of the 
comparisons that could be made, the experiments that may be built into the survey, 
and evaluation methods. 

GUIDELINE 4-3-1C: Watch for additional problem areas arising during the 
course of the survey and, where possible, collect and analyze appropriate data to 
assess the magnitude of the problem. 

GUIDELINE 4-3-1D: Analyze data from the survey evaluation prior to or 
concurrent with the analysis of the survey data so that the results of the evaluation 
can be taken into account when processing, analyzing, and interpreting the study 
data. 

GUIDELINE 4-3-1E: List 4-3-A may be used to help guide the development of 
evaluation plans during the survey planning stage and to develop a monitoring 
system for possible problems that may emerge during data collection and 
processing. The list identifies five categories of errors and enumerates potential 
sources of error within each category, methods to measure or evaluate them, and 
possible modifications for correcting them. 



LIST 4-3-A: MEASURING AND EVALUATING ERROR 


1. SAMPLE SELECTION, FRAMES AND COVERAGE— ADEQUACY OF 
FRAME 

A. Sources of error: 

1. Limitations of the frame — undercoverage/overcoverage of schools or 
institutions, duplicates, cases of unknown eligibility; 

2. Listing error — failure of initial respondents to include or exclude 
prospective respondents per instruction; and 

3. Selection of sampling units and respondent units within sampling units. 

B. Evaluation of survey coverage — examples: 

1. Comparison of estimated counts to reliable independent sources; 

2. Matching studies to earlier versions of the same data source or to other data 
sources and the use of dual system estimation; 

3. Analysis of survey returns for deaths, duplicates, changes in classification, 
and out-of-scope units; and 

4. Field work, such as area listings. 

C. Correcting for coverage error — examples: 

1. Use a dual-frame approach for survey estimation; and 

2. Employ post-stratification procedures. 

2. MEASUREMENT ERRORS— DATA COLLECTION 

A. Sources of error: 

1. Questionnaire design, content, wording and instructions; 

2. Length of reference period; 

3. Interview mode(s); 

4. Interviewers — characteristics, training, and supervision; 

5. Respondent rules — self versus proxy respondents; 

6. Use of records by respondents; 

7. Other respondent effects; 

8. Consistency and time-in-sample bias for longitudinal studies; 

9. Responses to related multiple measures within a questionnaire; 

10. Statistics derived for related measures from different questionnaires within a 
survey system; and 

11. Responses to related measures from multiple respondents in a sampled unit 
(e.g., parent/student). 

B. Evaluation of measurement errors — examples: 

1. Pilot or field test survey and procedures; 

2. Cognitive research methods; 

3. Reinterview studies; 

4. Response variance; 

5. Randomized experiments; 

6. Behavior coding; 

7. Interviewer variance studies; 

8. Interviewer observation studies; 

9. Record check studies; and 

10. Comparisons of related measures within questionnaires, across respondents, 
and across questionnaires within a survey system. 

C. Correcting for measurement errors — examples: 

1. Use the results from a pilot or field test to modify questionnaire and/or 
procedures; 

2. Use input from cognitive research to modify questionnaire; 

3. Where possible, use results from comparisons of related measures; and 

4. Employ interviewer retraining and feedback. 

3. DATA PREPARATION ERROR 

A. Sources of error: 

1. Pre-edit coding; 

2. Clerical review; 

3. Data entry; and 

4. Editing. 

B. Evaluation of processing errors — examples: 

1. Pre-edit coding; 

2. Clerical review verification; 

3. Data entry verification; 

4. Editing verification for manual edits; 

5. Edit rates; 

6. Coder error variance estimates; and 

7. Rating and scoring error variance estimates. 

C. Correcting for data preparation errors — examples: 

1. Resolution of differences identified in verification; 

2. Increased training; 

3. Feedback during rating and coding; and 

4. Edits for lack of internal agreement, where appropriate. 

4. SAMPLING AND ESTIMATION ERRORS 

A. Sources of error: 

1. Weighting procedures; 

2. Imputation procedures; and 

3. Sample survey estimation and modeling procedures. 

B. Evaluation of sampling and estimation errors — examples: 

1. Variance estimation; 

2. Analysis of the choice of variance estimator; 

3. Indirect estimates for reporting sampling error — use of generalized variance 
functions, small area estimates, and regression models; 

4. Comparison of final design effects with estimated design effects used in 
survey planning; 

5. Analysis of the frequency of imputation and the initial and final distributions 
of variables; and 

6. Analysis of the effect of changes in data processing procedures on survey 
estimates. 

C. Correcting for estimation errors — examples: 

1. Re-estimation using alternative techniques (e.g., outlier treatments, 
imputation procedures, and variance estimation procedures); and 

2. Explore fitting survey distributions to known distributions from other 
sources to reduce variance and bias. 


5. NONRESPONSE ERRORS 

A. Sources of error: 

1. Household/school/institution nonresponse; 

2. Person nonresponse; and 

3. Item nonresponse. 

B. Evaluation of nonresponse errors — examples (see Standard 4-4): 

1. Comparisons of respondents to known population characteristics from 
external sources; 

2. Comparisons of respondents and nonrespondents across subgroups on 
available sample frame characteristics or, in the case of item nonresponse, 
on available survey data; 

3. Comparisons of characteristics of early and late responding cases; 

4. Follow-up survey of nonrespondents for a reduced set of key variables to 
compare with data from respondents; and 

5. Descriptions of items not completed, patterns of partial nonresponse, and 
characteristics of sampling units failing to respond to certain groups of 
characteristics. 

C. Correcting for nonresponse errors — examples (see Standards 3-2, 4-1, and 4-4): 

1. If response rates are low during initial phases of data collection and funds 
are not available for intensive follow-up of all respondents, take a random 
subsample of nonrespondents and use a more intensive data collection 
method; 

2. Use nonresponse weight adjustments for unit nonresponse; and 

3. Use item imputations for item nonresponse. 
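As an illustration of item 2 above, one common form of nonresponse weight adjustment is a weighting-class adjustment: within each adjustment cell, the base weights of respondents are inflated so that respondents also represent the nonrespondents in that cell. The Python sketch below assumes that approach; the function name, the cell labels, and the data are hypothetical and are not prescribed by these standards.

```python
from collections import defaultdict


def adjust_for_unit_nonresponse(cells, base_weights, responded):
    """Return nonresponse-adjusted weights (0.0 for nonrespondents).

    Assumed technique: weighting-class adjustment. Within each cell the
    adjustment factor is
        (sum of base weights, all sampled cases)
        / (sum of base weights, respondents only).
    """
    total = defaultdict(float)
    resp_total = defaultdict(float)
    for cell, weight, is_resp in zip(cells, base_weights, responded):
        total[cell] += weight
        if is_resp:
            resp_total[cell] += weight

    # A cell without respondents cannot be adjusted; it must be collapsed
    # with another cell before this function is used.
    for cell in total:
        if resp_total[cell] == 0:
            raise ValueError(f"cell {cell!r} has no respondents")

    adjusted = []
    for cell, weight, is_resp in zip(cells, base_weights, responded):
        if is_resp:
            adjusted.append(weight * total[cell] / resp_total[cell])
        else:
            adjusted.append(0.0)
    return adjusted


# Hypothetical sample of four schools in two adjustment cells.
cells = ["public", "public", "private", "private"]
base_weights = [10.0, 10.0, 20.0, 20.0]
responded = [True, False, True, True]
adjusted = adjust_for_unit_nonresponse(cells, base_weights, responded)
print(adjusted)
```

In this sketch the adjusted weights of respondents sum to the total of the base weights, so weighted totals are preserved within each cell.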

D. Methods for reducing nonresponse — examples (see Standards 3-2, 4-1, and 4-4): 

1. Employ pretest or embedded experiments to determine the efficacy of 
incentives to improve response rates; 

2. Use internal reporting systems to monitor nonresponse during collection; 

3. Use follow-up strategies for nonrespondents to encourage participation; 

4. Target a set of key data items for collection with unwilling respondents; and 

5. For ongoing surveys, consider separate research studies to examine 
alternative methods of improving response rates. 


SUBJECT: NONRESPONSE BIAS ANALYSIS 
NCES STANDARD: 4-4 

PURPOSE: To identify the existence of potential bias due to unit and item 
nonresponse. 

KEY TERMS: base weight, frame, item nonresponse, nonresponse bias, overall unit 
nonresponse, potential magnitude of nonresponse bias, required response items, 
response rate, stage of data collection, survey, total nonresponse, unit nonresponse, and 
wave. 


STANDARD 4-4-1: Any survey stage of data collection with a unit or item response 
rate less than 85 percent must be evaluated for the potential magnitude of nonresponse 
bias before the data or any analysis using the data may be released. (See Standard 1-3 
for how to calculate unit and item response rates.) Estimates of survey characteristics 
for nonrespondents and respondents are required to assess the potential nonresponse 
bias. The level of effort required is guided by the magnitude of the nonresponse. 


STANDARD 4-4-2: When unit nonresponse is high, nonresponse bias analysis must 
be conducted at the unit level to determine whether or not the data are missing at 
random and to assess the potential magnitude of unit nonresponse bias. At the unit 
level, the nonresponse bias analysis must be conducted using base weights for the 
survey stage with nonresponse. The following guidelines must be considered in such 
analysis. 

GUIDELINE 4-4-2A: Comparisons of respondents and nonrespondents across 
subgroups using available sample frame characteristics provide information about 
the presence of nonresponse bias. This approach is limited because observed frame 
characteristics are often unrelated or weakly related to more substantive items in the 
survey. 
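The comparison described in Guideline 4-4-2A can be sketched in Python as follows. This is a minimal illustration, assuming a single categorical frame characteristic and base weights for every sampled case; the function names and the data are hypothetical.

```python
from collections import defaultdict


def weighted_distribution(categories, base_weights):
    """Base-weighted share of each category (shares sum to 1)."""
    totals = defaultdict(float)
    for category, weight in zip(categories, base_weights):
        totals[category] += weight
    grand_total = sum(totals.values())
    return {c: t / grand_total for c, t in totals.items()}


def compare_by_response_status(categories, base_weights, responded):
    """Return {category: (share among respondents, share among nonrespondents)}."""
    resp = weighted_distribution(
        [c for c, r in zip(categories, responded) if r],
        [w for w, r in zip(base_weights, responded) if r],
    )
    nonresp = weighted_distribution(
        [c for c, r in zip(categories, responded) if not r],
        [w for w, r in zip(base_weights, responded) if not r],
    )
    return {
        c: (resp.get(c, 0.0), nonresp.get(c, 0.0))
        for c in sorted(set(categories))
    }


# Hypothetical frame variable (school control) for four sampled schools.
control = ["public", "public", "private", "private"]
weights = [10.0, 30.0, 20.0, 40.0]
responded = [True, False, True, True]
comparison = compare_by_response_status(control, weights, responded)
for category, (resp_share, nonresp_share) in comparison.items():
    print(category, round(resp_share, 3), round(nonresp_share, 3))
```

Large gaps between the two shares for a category suggest that respondents and nonrespondents differ on that frame characteristic; whether a gap is statistically significant would still need a design-appropriate test.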

GUIDELINE 4-4-2B: Formal multivariate modeling can be used to compare the 
proportional distribution of characteristics of respondents and nonrespondents to 
determine if nonresponse bias exists and, if so, to estimate the magnitude of the 
bias. These multivariate analyses are used to identify the characteristics of cases 
least likely to respond to an interview (such analyses are often referred to as 
nonresponse propensity models). Cases are coded as either responding to or not 
responding to the interviews and multivariate techniques are used to identify which 
case characteristics significantly relate to unit nonresponse. The predictor variables 
should have very high response rates. This approach may be limited by the extent 
to which such predictors exist in the data. 

GUIDELINE 4-4-2C: Comparisons of respondents to known population 
characteristics from external sources can provide information about how the 
respondents differ from a known population. This approach is limited by 
information available from existing sources on the population of interest. Known 
population characteristics are often unrelated or weakly related to more substantive 
items in the survey. 

GUIDELINE 4-4-2D: For collections in which successive levels of effort (e.g., 
increasing number of contact attempts, increasing incentives to respond) are 
employed to reduce nonresponse, comparisons of characteristics can be made 
between the later/more difficult cases and the earlier/easier cases to estimate the 
characteristics of the remaining nonrespondents. This approach may be less 
effective if overall or total response rates are relatively low or if a collection period 
is relatively short in duration. In addition, the assumption that nonrespondents are 
like those respondents who are difficult to reach may not hold. 

GUIDELINE 4-4-2E: More intensive methods and/or incentives can be used to 
conduct a followup survey of nonrespondents on a reduced set of required response 
items. Comparisons between the nonrespondent follow-up survey and the original 
survey can be made to measure the potential magnitude of nonresponse bias in the 
original survey. This approach may be costly and less useful for modeling 
nonresponse bias if the nonrespondent follow-up survey response rates are also 
below 70 percent. 

GUIDELINE 4-4-2F: The estimated bias can be summarized using the following 
measures. One measure is the ratio of the bias to the standard error, using the base 
weight. A second measure is the ratio of the bias to the reported survey mean, 
using the base weight. If weighting adjustments are used to reduce bias, these 
measures should also be reported using the final weighted estimates. 
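The two summary measures in Guideline 4-4-2F can be sketched in Python as follows. The sketch assumes a variable that is known for every sampled case (for example, a frame variable), so that the bias of the respondent mean can be estimated as the base-weighted respondent mean minus the base-weighted full-sample mean. The standard error is assumed to come from a design-appropriate variance estimator and is simply passed in. All names and data are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted mean of values."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def bias_summary(values, base_weights, responded, standard_error):
    """Summarize estimated nonresponse bias using base weights.

    Returns the estimated bias, the ratio of the bias to the standard
    error, and the ratio of the bias to the respondent-based mean.
    """
    full_mean = weighted_mean(values, base_weights)
    resp_mean = weighted_mean(
        [v for v, r in zip(values, responded) if r],
        [w for w, r in zip(base_weights, responded) if r],
    )
    bias = resp_mean - full_mean
    return {
        "bias": bias,
        "bias_to_se": bias / standard_error,
        "bias_to_mean": bias / resp_mean,
    }


# Hypothetical 0/1 frame characteristic for six sampled cases.
values = [1, 0, 1, 1, 0, 0]
base_weights = [10, 10, 20, 20, 10, 30]
responded = [True, True, True, False, True, False]
summary = bias_summary(values, base_weights, responded, standard_error=0.05)
print(summary)
```

If weighting adjustments are applied, the same function can be called a second time with the final weights in place of the base weights to report the measures after adjustment.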


STANDARD 4-4-3: When item nonresponse is high, nonresponse bias analysis must 
be conducted at the item level to determine whether or not the data are missing at 
random and to assess the potential magnitude of item nonresponse. To analyze potential 
bias from item nonresponse, the guidelines below must be considered. 

GUIDELINE 4-4-3A: For an item with a low total response rate, respondents and 
nonrespondents can be compared on sampling frame and/or questionnaire variables 
for which data on respondents and nonrespondents are available. Base weights 
must be used in such analysis. Comparison items should have very high response 
rates. This approach may be limited to the extent that items available for 
respondents and nonrespondents may not be related to the low response rate item 
being analyzed. 

GUIDELINE 4-4-3B: Formal multivariate modeling can be used to compare 
characteristics of respondents and nonrespondents to determine if nonresponse bias 
exists and, if so, to estimate the magnitude of the bias. These multivariate analyses 
are used to identify the characteristics of cases least likely to respond to an item 
(such analyses are often referred to as nonresponse propensity models). Cases are 
coded as either responding to or not responding to the item and multivariate 
techniques are used to identify which case characteristics significantly relate to item 
nonresponse. Base weights must be used in such analysis. The predictor variables 
should have very high response rates. This approach may be limited by the extent to 
which such predictors exist in the data. 

GUIDELINE 4-4-3C: If the overall response rate is acceptable, nonresponse bias 
analysis may be conducted using data from survey respondents only. Unit-level 
respondents who answered the low response rate item can be compared to unit-level 
respondents who did not answer the item. Final weights and unimputed variables 
should be used in such an analysis. The comparison items should have very high 
item response rates. This approach may be limited because it does not directly 
analyze nonresponse bias that may originate because of unit-level nonresponse. 


ANALYSIS OF DATA/PRODUCTION OF 
ESTIMATES OR PROJECTIONS 


5-1 Statistical Analysis, Inference, and Comparison 
5-2 Variance Estimation 
5-3 Rounding 
5-4 Tabular and Graphic Presentations 


SUBJECT: STATISTICAL ANALYSIS, INFERENCE, AND COMPARISON 
NCES STANDARD: 5-1 

PURPOSE: To ensure that statistical analyses, comparisons, and inferences included in 
NCES products are based on appropriate statistical procedures. 

KEY TERMS: effect size, estimation, hypothesis testing, Minimum Substantively Significant 
Effect (MSSE), power, rejection region, simple comparison, statistical inference, survey, tail, 
Type I error, and Type II error. 


STANDARD 5-1-1: Statistical analyses must be approached from an analysis plan that 
considers relevance to policy, prior findings in existing literature, and/or results of 
previous survey research. The analysis plan must specify the main research questions, 
and justify the choice of statistical methodology. 


STANDARD 5-1-2: Analyses of sample survey data based on a stratified sample 
design must use appropriate case weights to correct for the unequal probabilities of 
selection. In the case of a stratified sample design with disproportionate sample 
allocation, the use of appropriate case weights will reduce the biases in means and 
totals, but will not necessarily correct biases in standard errors. 


STANDARD 5-1-3: The criterion for judging statistical significance in all reported 
hypothesis tests will be α = 0.05 (0.95 for confidence intervals). Reports will indicate 
an observed difference as statistically significant when an appropriate hypothesis test 
rejects the null hypothesis at α = 0.05. When estimates are compared to one another 
based on exploratory research and presented in descriptive reports, observed deviations 
in either direction are of interest and the rejection region lies within both tails of the 
distribution of the test statistic. The conclusions stated in the text are to be supported by 
two-tailed tests of significance (such as t tests or z tests). 

GUIDELINE 5-1-3A: If the survey purpose or prior research indicates that only 
differences between estimates in a specific direction are of interest or an established 
trend is to be updated with a new year of data, one-sided tests (in tests such as t tests 
or z tests) may be used to optimize power. In this case the region of rejection of the 
null hypothesis, H₀, is contained in only one tail of the sampling distribution of the 
test statistic. 
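The testing rule in Standard 5-1-3 and the one-sided option in Guideline 5-1-3A can be sketched in Python as follows. This is a minimal illustration, assuming two independent estimates whose standard errors already reflect the sample design and a large-sample normal approximation (a z test); the function name and the numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

ALPHA = 0.05  # significance criterion from Standard 5-1-3


def z_test_difference(est1, se1, est2, se2, two_sided=True):
    """z test for the difference between two independent estimates.

    Returns (z, p_value, significant). With two_sided=False the
    alternative hypothesis is est1 > est2.
    """
    z = (est1 - est2) / sqrt(se1 ** 2 + se2 ** 2)
    standard_normal = NormalDist()
    if two_sided:
        # Rejection region lies in both tails of the distribution.
        p_value = 2 * (1 - standard_normal.cdf(abs(z)))
    else:
        # Rejection region lies in the upper tail only.
        p_value = 1 - standard_normal.cdf(z)
    return z, p_value, p_value < ALPHA


# Hypothetical percentages and standard errors for two groups.
z, p_value, significant = z_test_difference(52.0, 1.5, 47.0, 2.0)
print(round(z, 2), round(p_value, 4), significant)
```

When the two estimates are not independent (for example, overlapping subgroups from the same sample), the covariance term would also need to enter the denominator, which this sketch omits.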


STANDARD 5-1-4: Reported analyses must focus on differences that are substantively 
important (i.e., it is not necessary, or desirable, to discuss every statistically significant 
difference in a report). Statistical analysis techniques must be used that are appropriate 
for the specific research question. The rationale for the analytic approach must be 
described. The efficacy of individual statistical approaches depends on the assumptions 
of the techniques having been met; therefore, the assumptions underlying the 
techniques must be discussed. 


GUIDELINE 5-1-4A: When conducting multiple comparisons, appropriate 
procedures should be considered to control the level of Type I error for 
simultaneous inferences. Multiple comparison procedures include, for example, 
Bonferroni, False Discovery Rate (FDR), Scheffe, and Tukey tests (see, for 
example, Hochberg and Tamhane 1987; Benjamini and Hochberg 1995). 
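Two of the procedures named in Guideline 5-1-4A can be sketched in Python as follows. The sketch assumes a list of p-values from a family of related tests and shows the Bonferroni adjustment and the Benjamini-Hochberg step-up procedure for the False Discovery Rate; the function names and the p-values are hypothetical.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure.

    Sort the p-values, find the largest rank k with p_(k) <= k * alpha / m,
    and reject the hypotheses with the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_rank = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank * alpha / m:
            largest_rank = rank
    reject = [False] * m
    for rank, index in enumerate(order, start=1):
        if rank <= largest_rank:
            reject[index] = True
    return reject


# Hypothetical p-values from five pairwise comparisons.
p_values = [0.001, 0.012, 0.025, 0.039, 0.2]
print(bonferroni(p_values))
print(benjamini_hochberg(p_values))
```

Bonferroni controls the chance of any false rejection in the family and is the more conservative of the two; the FDR procedure controls the expected share of false rejections among the rejected hypotheses, so it typically rejects more.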

GUIDELINE 5-1-4B: Alternative presentation of the results, such as confidence 
intervals or coefficients of variation, should also be considered as appropriate. 

GUIDELINE 5-1-4C: When testing for structure in the data over time, a trend test 
or other suitable procedure should be performed (e.g., regression, ANOVA, or 
nonparametric statistics). In conducting analyses over time, possible changes in 
population composition should be considered. 

GUIDELINE 5-1-4D: When it is appropriate, the use of multiple regression and 
multivariate analysis techniques should be considered to examine relationships 
between a dependent variable (e.g., test score) and a set of independent variables 
(e.g., race, sex, and family background). Such techniques can provide an integrated 
approach to testing many simultaneous relationships. 


GUIDELINE 5-1-4E: In general, standardized regression coefficients should be 
used. When the units of measurement are meaningful (e.g., number of years of 
schooling), unstandardized regression coefficients or mean differences should be 
provided. 

GUIDELINE 5-1-4F: When the results of an analysis are statistically significant, it 
is useful to consider the substantive interpretation of the size of the effect. For this 
purpose, the observed difference can be converted into an effect size to allow the 
interpretation of the size of the difference. 

For a t test of the mean difference, for example, the estimated effect size is the 
observed difference between the two observed means relative to a measure of 
variability, such as the standard deviation. 

In correlation analysis, r is the effect size. Consult Cohen (1988) for measures of 
effect size using additional statistical procedures. 

Cohen’s (1988) convention for interpreting effect sizes may be used. Empirical 
evidence has shown that for t tests or z tests, an effect size of 0.2 is small, 0.5 is 
medium, and 0.8 is large. As for correlations, an r of 0.1 is small, 0.3 is medium, 
and 0.5 is large. 

GUIDELINE 5-1-4G: Another approach to considering the substantive importance 
of a significant difference is to compare the size of the difference to the minimum 
substantively significant effect (MSSE) size that is determined a priori. 
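The effect size calculation in Guideline 5-1-4F and the MSSE comparison in Guideline 5-1-4G can be sketched in Python as follows. The sketch assumes the pooled standard deviation as the measure of variability (one common choice, giving Cohen's d) and unweighted data; with survey data the means and the standard deviation would be weighted estimates. The function names, labels, and data are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group1, group2):
    """Mean difference divided by the pooled standard deviation."""
    n1, n2 = len(group1), len(group2)
    pooled_variance = (
        (n1 - 1) * variance(group1) + (n2 - 1) * variance(group2)
    ) / (n1 + n2 - 2)
    return (mean(group1) - mean(group2)) / sqrt(pooled_variance)


def describe_effect(d):
    """Label an effect size using the 0.2 / 0.5 / 0.8 conventions."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "below small"


def meets_msse(d, msse):
    """True if the observed effect is at least the a priori MSSE."""
    return abs(d) >= msse


# Hypothetical scores for two small groups.
d = cohens_d([2, 4, 6], [1, 3, 5])
print(d, describe_effect(d), meets_msse(d, msse=0.3))
```

The MSSE passed to `meets_msse` has to be fixed before the data are analyzed; choosing it after seeing the results would defeat its purpose.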

GUIDELINE 5-1-4H: When reporting on the significance of important findings, 
confirmatory and corroborative statistical methods and significance tests should be 
used. For example, if the original significant finding is based on a simple 
comparison t test, t tests adjusted for multiple comparisons could also be used if 
appropriate. Another example would be to confirm important findings obtained with 
one analytic approach with a second analysis conducted using an alternative 
approach. 


STANDARD 5-1-5: Failure to reject the null hypothesis does not imply acceptance of 
the null hypothesis. When the null hypothesis is not rejected, the following options are 
available: 

1. Do not report on this test. 

2. Report that statistically significant differences or effects were not detected. 

3. If the significance is between .05 and .10, and the observed differences are believed 
to be real, based on research or other evidence, but are not significant at the .05 
level, possibly associated with small sample sizes and/or large standard errors, this 
may be noted. 

4. If the estimate is “unreliable,” the reader may be informed that the standard error is 
so high that the observed large differences are not statistically significant. 

5. If a statistically significant difference for a total group under study is observed, but 
similar subgroup differences of the same magnitude are associated with smaller 
sample sizes and/or larger standard errors and are not statistically significant, this 
may be noted. 

6. If there are large apparent differences that are not significant, possibly associated 
with small sample sizes and/or larger standard errors, this may be noted. 

7. Use a 95 percent confidence interval to describe the magnitude of the possible 
difference or effect. 


SUBJECT: VARIANCE ESTIMATION 


NCES STANDARD: 5-2 

PURPOSE: Given that most NCES sample designs have one or more of the following 
three characteristics: unequal probabilities of selection, stratification, and clustering, it is 
important to ensure that appropriate techniques for the estimation of variance in sample 
surveys are identified, implemented and documented. 

KEY TERMS: clustered samples, confidentiality, Data Analysis System (DAS), DEFT, 
design effect (DEFF), estimation, imputation, point estimate, raking, replication method, 
Simple Random Sampling (SRS), strata, survey, Taylor-series linearization, and variance. 


STANDARD 5-2-1: Variance estimates must be derived for all reported point estimates 
whether reported as a single, descriptive statistic (e.g., 6 percent of 1988 eighth-graders 
dropped out of school by 1990) or used in an analysis to infer or draw a conclusion (e.g., 
more 12th-graders took advanced-level mathematics courses in 1998 than in 1982). 


STANDARD 5-2-2: Variance estimates must be calculated by a method appropriate to a 
survey’s sample design (e.g., unequal probabilities of selection, stratification, clustering, 
and the effects of nonresponse, post-stratification, and raking). These estimates must 
reflect the design effect resulting from the complex design. 

Approximate variance estimation methods that adjust for most of the impact of clustering 
and stratification include bootstrap, jackknife, Balanced-Repeated Replication (BRR), 
and Taylor-series linearization. Replication methods (bootstrap, jackknife, and BRR) can 
also adjust for the impact of nonresponse, post-stratification, and raking. When 
replication methods are used, the number of replicates should be large enough to enable 
stable variance estimation (e.g., > 30) and small enough (e.g., < 100) for efficient 
calculation. 
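The replication approach described above can be sketched in Python as follows. The sketch assumes that the statistic has already been re-estimated once per replicate (normally with replicate weights supplied on the data file) and uses the textbook scaling factors: (R - 1)/R for a delete-one jackknife and 1/R for BRR, where R is the number of replicates. A particular survey's documentation may specify different factors, and the function name and data are hypothetical.

```python
def replicate_variance(full_estimate, replicate_estimates, method="jk1"):
    """Variance estimate from replicate estimates.

    method="jk1": delete-one jackknife, factor (R - 1) / R
    method="brr": balanced repeated replication, factor 1 / R
    """
    r = len(replicate_estimates)
    if r < 2:
        raise ValueError("at least two replicates are needed")
    sum_of_squares = sum(
        (estimate - full_estimate) ** 2 for estimate in replicate_estimates
    )
    if method == "jk1":
        factor = (r - 1) / r
    elif method == "brr":
        factor = 1 / r
    else:
        raise ValueError(f"unknown method: {method!r}")
    return factor * sum_of_squares


# Toy example: delete-one jackknife for the mean of five values.
# Real applications would use many more replicates (see the text above).
data = [1.0, 2.0, 3.0, 4.0, 5.0]
full_mean = sum(data) / len(data)
replicate_means = [(sum(data) - x) / (len(data) - 1) for x in data]
variance_of_mean = replicate_variance(full_mean, replicate_means)
standard_error = variance_of_mean ** 0.5
print(variance_of_mean, standard_error)
```

For a simple mean, the delete-one jackknife reproduces the familiar s²/n, which gives a quick check that the factor has been applied correctly.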

GUIDELINE 5-2-2A: The preferred way to derive appropriate variance estimates for 
totals, means, proportions and regression coefficients is to use a statistical package 
that does not assume simple random sampling (SRS). Such packages include 
SUDAAN, WesVar, DAS, or Stata, and use such techniques as Taylor-series 
linearization or one of the replication methods mentioned above. 

GUIDELINE 5-2-2B: Consideration should be given to incorporating an adjustment 
for imputations in variance estimation procedures. 

GUIDELINE 5-2-2C: In some cases, alternative approximation strategies can be 
used to produce variance estimates. For example, software for multilevel models can 
be used to produce estimates that take into account some aspects of complex survey 
design. Care must be taken to include any clustering of the sample as a level in the 
model(s). In addition, any design variables and weights, such as those associated with 
strata or measures of size, should be taken into account. 


STANDARD 5-2-3: Data files must include all information necessary for point 
estimation and variance estimation (e.g., probabilities of selection, weights, stratum and 
PSU codes), subject to confidentiality constraints (see Standard 7-1 on Machine Readable 
Data Products and Standard 4-2 on Maintaining Confidentiality). 


SUBJECT: ROUNDING NUMBERS AND PERCENTAGES FOR REPORTING 
IN TEXT AND DISPLAYING IN SUMMARY TABLES AND FIGURES 

NCES STANDARD: 5-3 

PURPOSE: To ensure consistent practices for rounding and displaying numbers and 
percentages in text and tables/figures. 

KEY TERMS: precision, survey, and universe. 


STANDARD 5-3-1: Calculations performed to produce summary data, and 
computations performed to estimate standard errors, must be done on numbers and 
percentages that are carried out to at least four decimal places (i.e., not on proportions). 
The final rounded value must be obtained from the original figure available, not from a 
series of roundings (e.g., 7.1748 can be 7.175 or 7.17 or 7.2 or 7 but not 7.18). This 
situation typically arises when researchers round percentages from tables in tenths of a 
percent to full percents to be used in text. 


STANDARD 5-3-2: Sums of column or row counts in a table must be derived using 
unrounded numbers, with appropriate rounding of the total after its derivation. All 
tables that should logically sum to either 100 percent, or to a numeric total, must 
include a note that states: NOTE: Detail may not sum to totals because of rounding. 


STANDARD 5-3-3: Because of software limitations, for presentation purposes the 
following specific rules for rounding must be used: 

If the first digit to be dropped is less than 5, the last retained digit is not changed. 

6.1273 is rounded to 6.127 

If the first digit to be dropped is greater than or equal to 5, the last digit retained is 
increased by 1. 

6.6888 is rounded to 6.69 
5.451 is rounded to 5.5 
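The rounding rule in Standard 5-3-3, together with the requirement in Standard 5-3-1 to round once from the original figure, can be sketched in Python as follows. The sketch uses the standard library's decimal module with round-half-up, because Python's built-in round() works on binary floating-point values and rounds ties to the nearest even digit, which does not match the rule stated above. The function name is hypothetical, and values are passed as strings to avoid floating-point representation error.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_half_up(value, places):
    """Round value to the given number of decimal places, halves up."""
    quantum = Decimal(1).scaleb(-places)  # places=2 gives Decimal('0.01')
    return Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)


# The examples from Standard 5-3-3.
print(round_half_up("6.1273", 3))  # 6.127
print(round_half_up("6.6888", 2))  # 6.69
print(round_half_up("5.451", 1))   # 5.5

# Standard 5-3-1: round once from the original figure.
direct = round_half_up("7.1748", 2)                      # 7.17
chained = round_half_up(round_half_up("7.1748", 3), 2)   # 7.18, not allowed
print(direct, chained)
```

The last two lines show why a series of roundings is prohibited: rounding 7.1748 to 7.175 and then to two places gives 7.18, while rounding the original figure directly gives 7.17.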


STANDARD 5-3-4: In multiplying or dividing numbers using data from secondary 
sources, the resulting precision cannot be more precise than that of any of the 
component numbers. (For example, if 4.5 and 5.75 are rounded numbers, the product 
can be stated only as 26, with 4.5 having two significant digits and 5.75 having three.) 


STANDARD 5-3-5: Before rounding numbers for publication, a decision must be 
made about the appropriate number of decimal places to be reported using the 
following rules: 

1. Percentages appearing in text must be rounded to whole numbers unless small 
differences require finer breakdowns. Summary tables must be rounded to no more 
than one decimal place. 

2. Percentages appearing in reference and methodological tables must be rounded to 
no more than two decimal places except in certain methodological tables where 
finer breakdowns may be necessary. 

3. Standard errors must be rounded to one decimal place more than the estimates for 
which they are computed. 

4. Universe data may be reported unrounded. Sample survey data must be rounded. 

5. A measured zero in a universe survey (i.e., none of something) must always appear 
in a table or a figure as 0. If rounding is used in a universe survey, numbers that 
round to zero must be represented in tables and figures by the symbol #. 

6. When dealing with small values in sample surveys, zero and numbers that round to 
zero must be represented in tables and figures by the symbol #. 

7. When it is logically impossible to have a response in a cell (i.e., not applicable), 
the cell must be denoted by the symbol †. 

GUIDELINE 5-3-5A: Numbers appearing in text and summary tables should 
adhere to the following conventions: 

1. Round four- and five-digit numbers to hundreds (e.g., 1,255 is rounded to 1,300; 
56,789 is rounded to 56,800); 

2. Round six-digit numbers to thousands (e.g., 156,789 is rounded to 157,000); 
and 

3. Round millions and larger numbers to no more than two decimal places (e.g., 
1,234,567 is rounded to 1.2 or 1.23 million; 1,912,345,678 is rounded to 1.9 or 
1.91 billion). 


SUBJECT: TABULAR AND GRAPHIC PRESENTATIONS 


NCES STANDARD: 5-4 

PURPOSE: To ensure that tables and graphics displayed in NCES products 
communicate information accurately, clearly, and efficiently. This will allow the reader 
to easily and correctly interpret the presentation as a stand-alone display. 

KEY TERMS: point estimate, reference year, survey, and survey year. 


STANDARD 5-4-1: All tables must be produced in accordance with the “NCES 
Guidelines for Tabular Presentations” (appendix C). 


STANDARD 5-4-2: Graphics must highlight important points. 


STANDARD 5-4-3: All figures (graphs, maps, or charts) must be understandable 
without reference to the text. 

1. Each figure must have a concise title that identifies the content of the figure and the 
reference period for the survey. 

2. Each figure must include all notes necessary to convey information not immediately 
evident from the main graphic, such as notes that define acronyms, explain special 
terms, or define the underlying population included in the analysis. 

GUIDELINE 5-4-3A: Bar and pie charts should include point estimates for each 
category displayed. 


STANDARD 5-4-4: All figures must be consistent with best practices for graphical 
display. All figures must adhere to the following: 

1. Omit distracting detail. For example, avoid the use of three-dimensional effects 
when only two dimensions are displayed. 

2. Be easy to read. For example, all elements (font, lines, labels, symbols, segments, 
etc.) should be large enough to read with ease in the printed form, easily 
differentiated, and legible when photocopied or printed in black and white. 

3. Be consistent with and prepared in the same style as other figures in the same 
publication or product. For example, lettering should be of similar size and font, 
lines of the same weight, symbols or legends should be used for the same 
categories. 

4. Use consistent scales with consistent spacing when presenting similar units of 
measurement. 

5. With the exception of time-series, continuous scales should start with zero or the 
minimum value of the scale. If used, scale breaks should be clearly visible. 

6. When using time-series data, time intervals should be plotted on a linear scale and 
actual data points should be labeled. 

7. Include labels for all variables and categories. 

8. Clearly label all axes and include tick marks on axes. 

9. Prepare figures with patterns, screens, or colors selected to print clearly across 
different media. In addition, all tables and figures must be in compliance with 
Section 508 standards that require that information on web pages be made 
"accessible" to people with a wide range of disabilities, including vision and 
hearing impairments, dexterity problems, color blindness, and even rare conditions 
such as photosensitive epilepsy triggered by rapidly flashing lights. For the full text 
of the law, see: 

www.cio.gov/Documents/section%5F508%5Faugust%5F1998%2Ehtml 


STANDARD 5-4-5: All figures must incorporate a complete source note. A complete 
source note identifies all the sources relevant to the data presented in the figure. 

GUIDELINE 5-4-5A: For figures based on data from one or more reports, the 
Source should cite the report, relevant survey(s) or subsurvey(s), data reference 
year, file version number, department name, and agency name. In the case of 
unpublished data, use the month and year of the tabulation or data file. If the data 
are drawn from multiple years: for 1 to 3 years, report each year; for more than 3 
continuous years, use the year span; and for more than 3 noncontinuous years, use 
“selected years” and the year span. (See appendix D for list of survey titles.) 

EXAMPLES: 

Data from one or more reports: 

Revenues and Expenditures for National Public Elementary and Secondary 
Education: School Year 1997-98, Common Core of Data (CCD), “National 
Public Education Financial Survey” (NPEFS), 1997-98, Version 1, U.S. 
Department of Education, National Center for Education Statistics. 

Data from unpublished tabulations and a published NCES report: 

SOURCE: U.S. Department of Commerce, Bureau of the Census, Current 
Population Survey, previously unpublished tabulations (April 1998); and U.S. 
Department of Education, National Center for Education Statistics, Dropout 
Rates in the United States. Selected years, 1972-97. 

GUIDELINE 5-4-5B: For figures based on data from a compendium report, the 
source note should cite the compendium report and the original survey or survey 
report (e.g., 1998 Digest of Education Statistics, Integrated Postsecondary 
Education Data System, Fall Enrollment 1997). 

GUIDELINE 5-4-5C: For figures based on unpublished tabulations from surveys 
that are not the main focus of the report, the source note should indicate the data 
source followed by “previously unpublished tabulation.” 

GUIDELINE 5-4-5D: For figures based on online data tools, the source note 
should cite the data source and the data tool. 


STANDARD 5-4-6: Supporting data for figures must be included in the publication or 
product. In the case of reports that are extracts that summarize existing publications, 
supporting data are not required, but summary products must refer to the full report. In 
the case of short publications (i.e., 15 pages or less), if supporting data are not available 
in a published report, they must be available on the web and the publication must refer 
to the URL. (See web standards for URL format.) 


STANDARD 5-4-7: All tables that should logically sum to either 100 percent, or to a 
numeric total, must include a note that states: NOTE: Detail may not sum to totals 
because of rounding. 


STANDARD 5-4-8: Figures in the executive summary must be assigned alpha 
characters consecutively and figures in reports must be assigned numbers. Figures in 
appendixes must be assigned the letter of the appendix and a number suffix (e.g., 
figures in appendix A must be labeled A-1, A-2, etc.). 


STANDARD 5-4-9: Data for the outlying areas must be excluded from U.S. summary 
totals, unless separate totals are shown. 


STANDARD 5-4-10: When presenting multiple related figures on one page, a 
summary title must appear at the top of the page and each figure must have its own title. 
When using multiple related figures from one source on the same page, the source note 
must be provided at the bottom of the page. When using multiple related figures from 
different sources on the same page, source notes must be provided for each figure. 
These source notes must follow the guidelines in Standard 5-4-5. 


REFERENCES: 

Data Documentation Initiative, http://www.icpsr.umich.edu/DDL 

Harris, R.L. (1999). Information Graphics: A Comprehensive Illustrated Reference: 
Visual Tools for Analyzing, Managing, and Communicating. New York, NY: Oxford 
University Press. 

Schmid, C.F., and Schmid, S.E. (1979). Handbook of Graphic Presentation. New York, 
NY: Wiley. 

Tufte, E.R. (1983). The Visual Display of Quantitative Information. Cheshire, CT: 
Graphics Press. 

Tufte, E.R. (1997). Visual Explanations: Images and Quantities, Evidence, and 
Narrative. Cheshire, CT: Graphics Press. 

U.S. Department of Education, Office of Educational Research and Improvement. 
(1999). OERI Publications Guide. Washington, DC: U.S. Government Printing Office. 


ESTABLISHMENT OF REVIEW 
PROCEDURES 


6-1 Review of Reports and Data Products 


SUBJECT: REVIEW OF REPORTS AND DATA PRODUCTS 


NCES STANDARD: 6-1 

PURPOSE: To ensure that NCES produces and releases high quality products suitable 
for a variety of audiences, NCES employs a multistage review process for all NCES 
products. In the case of descriptive, analytic, and technical reports, the review process 
includes internal and external peer review comments that are addressed through a 
formal review meeting, known as the adjudication meeting. 

KEY TERMS: key variables. 


STANDARD 6-1-1: Prior to the release of a new microdata file, a report presenting 
the key variables contained on the file must be adjudicated and made available to the 
public. Key variables include the major variables that were identified in the analysis 
plan, and those items that will be maintained over time as part of an NCES data series. 


STANDARD 6-1-2: All NCES products must be reviewed for technical details and 
overall quality. The level of review required for each type of product is identified in 
Table 6-1-A. NCES uses six levels of review: 

Level 1. Review and Adjudication: Requires Program Director (PD), Senior 
Technical Advisor (STA), Associate Commissioner (AC), Office of the Deputy 
Commissioner (ODC), and Office of the Commissioner (OC) review and 
signoff, and outside reviewers are included on the review committee. 

Level 1a. Rolling Review: Requires PD/STA/AC/ODC review and approval as 
parts of the whole are completed. Final product requires full Level 1 review. 

Level 2. Statistical Review: Requires PD/AC/ODC review and approval, but 
no outside review or adjudication. The inclusion of an STA review is at the 
discretion of the AC. 

Level 3. AC/ODC/OC: Requires PD/AC/ODC/OC review and approval, but no 
outside review or adjudication. The inclusion of an STA review is at the 
discretion of the AC. 

Level 4. AC: Requires PD/AC review and approval, but no ODC/OC or outside 
review or adjudication. 

Level 5. NCES/RIMG/OMB: Requires PD/STA/AC approval within NCES, 
plus review/approval by Regulatory Information Management Group (RIMG) 
and Office of Management and Budget (OMB), and copy to Chief Statistician. 

Level 6. Author/Web publisher: Requires full review/adjudication as 
appropriate for the original NCES numbered product. 
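
As an illustration only, the sign-offs named for each level above can be held in a small lookup table so that a tracking script can report what is still outstanding. The sketch below is not an NCES tool; the dictionary layout and the function name `outstanding_signoffs` are assumptions made for the example. Level 6 is left out because it relies on the review of the original numbered product, and the optional STA review at Levels 2 and 3 is not modeled.

```python
# Sign-offs listed for each review level in Standard 6-1-2.
# The structure and names here are illustrative assumptions.
REQUIRED_SIGNOFFS = {
    "1": ["PD", "STA", "AC", "ODC", "OC"],
    "1a": ["PD", "STA", "AC", "ODC"],
    "2": ["PD", "AC", "ODC"],
    "3": ["PD", "AC", "ODC", "OC"],
    "4": ["PD", "AC"],
    "5": ["PD", "STA", "AC", "RIMG", "OMB"],
}


def outstanding_signoffs(level, obtained):
    """Return the required sign-offs for `level` that are not yet in `obtained`."""
    return [role for role in REQUIRED_SIGNOFFS[level] if role not in obtained]


# A Level 2 product that has Program Director and Associate Commissioner approval
print(outstanding_signoffs("2", {"PD", "AC"}))
```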

STANDARD 6-1-3: Reports requiring Level 1 Review and Adjudication must go 
through the review procedures outlined in List 6-1-A and Chart 6-1-A. 


STANDARD 6-1-4: All NCES web products/applications require review as outlined in 
table 6-1-B. 


STANDARD 6-1-5: The NCES publication process and related timelines must be 
documented on the publication sign-off sheet (Form 6-1-A). 

Table 6-1-A. NCES products: Required reviews 

                                        Type of review required (Level)
Product                                 1    1a   2    3    4    5
Compendium                              X
Directory                               X
NCES Handbook                           X
Updated indicator                            X
Pre-release data                                                X
Statistical Analysis Report             X
R&D Report                              X
Technical/Methodological Report         X
Statistics in Brief                     X
E.D. TABS                               X
Issue Brief                             X
Quarterly                                         X
Re-packaged Excerpt only                          X
Guide (e.g., Programs & Plans)                    X
Working Paper                                                   X
Data File (including CD ROM/DAS/WEB)              X
Data File Documentation/User's manual             X
  (must accompany data file)
Video/Data                                        X
Conference Report                                               X
Non-data Videotape (e.g., conference,                           X
  Commissioner's statements)
Brochure/Pamphlet                                      X
Newsletter                                             X
Co-op Product (e.g., FORUM, NPEC)                               X
Questionnaire                                                        X
Glossary                                               X

Level 1. Review and Adjudication: Requires PD/STA/AC/ODC/OC review and signoff, 
and outside reviewers are included in the review committee. 

Level 1a. Rolling Review: Requires PD/STA/AC/ODC review and approval as parts of 
the whole are completed. Final product requires full Level 1 review. 

Level 2. Statistical Review: Requires PD/AC*/ODC review and approval, but no 
outside review or adjudication. 

Level 3. AC/ODC/OC: Requires PD/AC*/ODC/OC review and approval, but no outside 
review or adjudication. 

Level 4. AC: Requires PD/AC* review and approval, but no outside review or 
adjudication. No official NCES distribution but made available via web or 
special request. 

Level 5. NCES/RIMG/OMB: Requires PD/STA/AC approval within NCES plus 
review/approval by RIMG & OMB, and copy to Chief Statistician. 

Note: AC* review may or may not require STA review at the discretion of the AC. 

Table 6-1-B. NCES web products: Required reviews 

                                        Type of review required (Level)
Product                                 1    2    3    4    6

Web applications: 

NCES products (with #): 
  pdf file                              X                   X
  Html                                  X                   X
  ASCII/Excel/data base file*                X              X
  Conference Report/Co-op Product                      X    X

Tools: 
  Locator 
  Peer Tool: Public Access 
  Peer Tool: Limited Access* 
  Data Tool 
  Questionnaire Tool 
  Glossary Search - based on approved product (with NCES #) 
  Table/Figure Search 
  DAS 

Web sites; pages; information sources: 
  Survey/Program site 
  Web Publication 
  Quick Facts 
  Video 
    Informational Video 
    Data Video 
  PowerPoint Presentation 
  Quick tables/figures (quarterly) 
  Unadjudicated Co-op Product 
  Working Paper 

* Excludes pre-release data. 

NOTE: All tools with micro data will be subjected to data snooping tests as well 
as appropriate review. A full adjudication review is required only for new 
products. Updates to current products only require review of the update 
information as appropriate. 

Level 1. Review and Adjudication: Requires PD/STA/AC/ODC/OC review and signoff, 
and outside reviewers are included in the review committee. 

Level 1a. Rolling Review: Requires PD/STA/AC/ODC review and approval as parts of 
the whole are completed. Final product requires full Level 1 review. 

Level 2. Statistical Review: Requires PD/AC*/ODC review and approval, but no 
outside review or adjudication. 

Level 3. AC/ODC/OC: Requires PD/AC*/ODC/OC review and approval, but no outside 
review or adjudication. 

Level 4. AC: Requires PD/AC* review and approval, but no outside review or 
adjudication. No official NCES distribution, but made available via web or 
special request. 

Level 6. Author/Web Publisher: Assumes previous adjudication/review as 
appropriate for the original NCES numbered product. 

Note: AC* review may or may not require STA review at the discretion of the AC. 




[Review marks (X) for the Tools and Web sites rows of Table 6-1-B: column positions not recoverable.] 

LIST 6-1-A: KEY STEPS IN THE REVIEW AND ADJUDICATION PROCESS 


NCES reports that include data or the analysis of data undergo both internal and 

external peer review. 

PROGRAM REVIEW PROCESS 

Decision: NCES Author submits draft report to Program Director for review. 

Sign-off: Program Director 

DIVISION REVIEW PROCESS 

Decision: NCES Author submits draft report to Senior Technical Advisor for review. 

The Senior Technical Advisor sends signed-off draft to the Associate 
Commissioner for clearance and to the Chief Statistician for a pre-review. 

Sign-off: Senior Technical Advisor, Associate Commissioner, and Chief Statistician 

APPROVAL OF PROPOSED REVIEWERS 

Decision: NCES Author submits reviewer memo through the Associate 
Commissioner to the Office of the Commissioner (OC) 3 weeks before the 
date the report is due to OC. The reviewers must include two relevant 
specialists from other NCES programs, and one or more external 
reviewers for additional subject matter or technical expertise. 

Sign-off: Associate Commissioner and Commissioner 

SUBMIT REPORT TO THE OFFICE OF THE COMMISSIONER 

Decision: NCES Author submits approved peer review list and the publication to the 
Office of the Commissioner for clearance for distribution for review. 

Sign-off: Commissioner 

REVIEW BY DIRECTOR 

Decision: Five (5) working days for review by the Institute of Education Sciences 
(IES). 

Sign-off: Director 

INTERIM REVISION PERIOD 

Decision: Ten (10) working days for author to make revisions requested from IES. 

Sign-off: Associate Commissioner, in consultation with the Chief Statistician and 
the Commissioner 

SCHEDULE ADJUDICATION MEETING 

Decision: NCES Author requests a Statistical Standards Program (SSP) chair for an 
adjudication meeting. After a chair is selected, an adjudication meeting is 
scheduled. 

Sign-off: Chief Statistician 

DISTRIBUTION FOR REVIEW 

Process: NCES Author sends peer review draft to internal and external reviewers. 

This draft should include supporting documentation for statistical testing. 
At the same time, the Office of the Commissioner sends the peer review 
draft for Principal Operating Component (POC) review allowing 2 days 
for distribution. All reviewers also receive notification of the time and 
place for the adjudication meeting, and an invitation to attend (see Letter 
6-1-A). 

Review period: Eighteen (18) or more working days for all reports. NCES Authors 
are to allow 15 days for peer review, with a request for written 
comments from reviewers no later than 3 days prior to the scheduled 
adjudication meeting. 

PREPARATION OF REVIEWERS' COMMENTS 

Process: NCES Author delivers one copy of all POC and peer reviewer comments to 
the Chief Statistician and one copy to the adjudicator 2 working days 
before the scheduled adjudication meeting. To concentrate the 
adjudication meeting on areas needing resolution, a pre-adjudication 
memo with author agreement and suggested responses to comments 
should, when possible, be provided at the adjudication. 

ADJUDICATION MEETING DECISION 

Decision: If, and only if, comments from all reviewers are received and are minimal, 
the author may recommend not holding the adjudication meeting. 

Sign-off: Chief Statistician 

ADJUDICATION MEETING 

Process: The Adjudicator chairs a meeting of the author and reviewers. The Author 
presents major points from the written comments of reviewers; these are 
discussed and resolved by the participants. The Adjudicator makes 
decisions if no consensus is reached. Prior to the end of the meeting, the 
author is responsible for summarizing the description of all revisions 
agreed upon during the meeting. The Author obtains assurance from the 
Adjudicator that the publication with proposed changes will meet NCES 
standards. Any appeals to decisions may be made to the Chief Statistician. 
In cases where the revisions result in new analysis and/or extensive 
rewriting, a second adjudication meeting may be held. 

POST-ADJUDICATION REVISIONS AND CLEARANCE 
Decision: Within 15 working days, the NCES author submits the revised publication, 
along with a post-adjudication memo that describes all changes, to the 
adjudicator for review. 

Sign-off: Chief Statistician, based on recommendation of the Adjudicator. 

NOTE: The Commissioner of NCES is the final judge of the content of NCES 
publications. If the Commissioner delegates this authority, decisions may be 
appealed to the Commissioner. 
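
The working-day intervals in List 6-1-A (18 or more working days for peer review, 15 working days for post-adjudication revisions) can be turned into calendar dates with a short helper. The following is a minimal sketch, not an NCES procedure: the function names are hypothetical, and it assumes a Monday-to-Friday work week and ignores federal holidays, which a real schedule would need to account for.

```python
from datetime import date, timedelta


def add_working_days(start, days):
    """Return the date `days` working days after `start` (Mon-Fri only, no holidays)."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday through Friday
            remaining -= 1
    return current


def review_schedule(distribution_date):
    """Earliest dates implied by the 18-day review and 15-day revision periods."""
    meeting = add_working_days(distribution_date, 18)
    revisions_due = add_working_days(meeting, 15)
    return {"earliest_adjudication": meeting, "revisions_due": revisions_due}


# Draft distributed to reviewers on Monday, January 6, 2003
print(review_schedule(date(2003, 1, 6)))
```

Because the review period is stated as "eighteen or more" working days, the first date is a lower bound rather than a fixed appointment.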

CHART 6-1-A: NCES PUBLICATION REVIEW PROCESS 

[Flowchart not reproduced. First step shown: Program Review.] 

Form 6-1-A 

National Center for Education Statistics 
Publication Review Form 

Pub #:                              Title: 
NCES Author:                        Phone: 
Division:                           No. Pages: 
Program:                            Room: 

Adjudicated Publications            Other Publications/Products 
User's Manual/Data File             [ ] Non-data video 
Issue Brief                         [ ] Brochure 
Statistics in Brief                 [ ] Pamphlet 
Statistical Analysis                [ ] Newsletter 
Technical                           [ ] Glossary 
R&D                                 [ ] Cooperative product 
Compendium                          [ ] Conference Report 
Guide                               [ ] Working Paper 
Handbook/Directory                  [ ] Questionnaire 
E.D. TABS                           [ ] Compilation 

REVIEWERS                                   DATE        DATE        DATE        INITIALS
                                            In   Out    In   Out    In   Out
NCES Staff Submits Pub 
Program Director Review 
Review Memo to Program Dir. 
Review Memo to Assoc. Com. 
Review Memo to Commissioner 
Senior Technical Advisor Review 
Associate Commissioner Review 
Chief Statistician (CS) Pre-review* 
Pub to Pub. Coordinator 
Assistant Secretary Review 
Author Submits Revised Pub to AC/Chief Stat./Commissioner 
Schedule Adjudication with CS 
Copies to Pub Coordinator 
Copies to Peer Reviewers 
Reviewer Comments/Memo to Adj. 
Adjudication Meeting 
Post-Adjudication Pub, Memo, Web form, Abstract to Adjudicator 
Post-Adjudication Clearance CS 
Camera Copy to Pub Coordinator 
Announcement to Pub Coordinator 
Publications Section Review/GPO 
Members Web Form 
PDF to Webmaster 
Pub to GPO 

* To occur concurrent with the Associate Commissioner's review. 

Letter 6-1-A 


U.S. DEPARTMENT OF EDUCATION 
INSTITUTE OF EDUCATION SCIENCES 
NATIONAL CENTER FOR EDUCATION STATISTICS 


<Month xx, 20xx> 


Reviewer 
Position 
Organization 
Street Address 
City, State Zip Code 

Dear <Reviewer's name>: 

Thank you for agreeing to serve as a reviewer for the publication "<Report Title>" prepared 
by <Contractor's company name> for the National Center for Education Statistics. This is the 
<identify report history or type>. This report provides <describe report contents>. The purpose 
of the study was to <describe the purpose>. 

I have enclosed a copy of the adjudication draft of the report for your review. The peer 
review process is an important part of maintaining the high standards of NCES publications. Your 
contribution to this process is greatly appreciated. Because this publication is still in review, no 
information from this report may be made public prior to its official release. 

The adjudication meeting is scheduled for <hour of the day> on <day of week, month, 
and date> in room <number> of our building at 1990 K Street, NW, Washington DC. It would be 
helpful if I could receive your comments by <day of week, month, and date> at the latest. 
Comments can be sent to me by mail, e-mail, or fax. All reviewers are invited to attend the 
meeting; however, all comments received will be discussed, whether or not a reviewer is able to 
attend the meeting in person. If you cannot attend, I will send a summary response to your 
comments, as addressed at the meeting, upon your request. 

Thank you, again, for your time and effort in reviewing this report. If you need to reach 
me, my phone number is 202-502-xxxx, fax number is 202-502-xxxx, and e-mail address is 
firstname.lastname@ed.gov. 


Sincerely, 


<Staffer's name> 
Position 
National Center for Education Statistics 
1990 K Street, NW, Suite xxxx 
Washington, DC 20006-xxxx 

DISSEMINATION OF DATA 


7-1 Machine Readable Products 

7-2 Survey Documentation in Reports 

7-3 Release and Dissemination of Reports and Data 
Products 

SUBJECT: MACHINE READABLE PRODUCTS 


NCES STANDARD: 7-1 

PURPOSE: To ensure the utility of data files created by NCES staff and contractors, all 
NCES data files must be accompanied by easily accessible documentation that clearly 
describes the metadata necessary for users to access and manipulate the data. 

KEY TERMS: confidentiality, confidentiality edit, edits, imputation, metadata, 
reference year, response rates, survey, survey system, survey year, universe, and 
variance. 


STANDARD 7-1-1: Machine readable products must be released in ASCII format. 
Machine readable products include flat files, relational databases, and spreadsheets. Each 
record must contain a unique case identifier such as ID. Files with multiple records per 
case must also contain unique record type identifiers (e.g., record number, year of data). 
Data files must be in one of two acceptable formats: 

1. Delimited, text quoted file format that is importable; or 

2. Positional files where the locations of all variables are identified (i.e., file, record 
within file, and position within record). 
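
To make the two acceptable formats concrete, the sketch below writes the same two records first as a delimited, text-quoted file and then as a positional file. It is an illustration under stated assumptions, not an NCES utility: the variable names (`ID`, `STATE`, `ENROLL`), the column positions, and the function names are hypothetical, and the comma is only one possible delimiter.

```python
import csv
import io

# Hypothetical records and variable names, for illustration only.
RECORDS = [
    {"ID": "0001", "STATE": "VA", "ENROLL": 512},
    {"ID": "0002", "STATE": "MD", "ENROLL": 48},
]


def to_delimited(rows):
    """Format 1: comma-delimited, with every text value enclosed in quotes."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_NONNUMERIC, lineterminator="\n")
    for row in rows:
        writer.writerow([row["ID"], row["STATE"], row["ENROLL"]])
    return buf.getvalue()


def to_positional(rows):
    """Format 2: ID in positions 1-4, STATE in 5-6, ENROLL in 7-11 (right-justified)."""
    lines = [f"{row['ID']:<4}{row['STATE']:<2}{row['ENROLL']:>5d}" for row in rows]
    return "\n".join(lines) + "\n"


print(to_delimited(RECORDS), end="")
print(to_positional(RECORDS), end="")
```

In both cases each record carries its own case identifier, and the positional version only becomes usable once the positions are published in the record layout.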

GUIDELINE 7-1-1A: Data producers are invited to provide additional data sets in 
alternate formats that may be helpful to users. For guidance on web-based formats, 
see the NCES public web publishing standards; request a copy by sending an e-mail 
to NCESwebmaster@ed.gov. 

GUIDELINE 7-1-1B: To facilitate the sharing and use of data elements, national 
and international standards organizations have produced drafts of several standards 
for the creation of metadata on data elements. Examples are the International 
Organization for Standardization “Specification and Standardization of Data Elements” 
standard (ISO/IEC 11179) and the more detailed American National Standards 
Institute “Metadata for the Management of Shareable Data” Standard (ANSI X3.285) 
(www.ansi.org). These standards continue to be refined. Data producers should 
determine what metadata standards are current at the time data files are prepared and 
produce associated metadata for their files that are in compliance with applicable 
standards. 


STANDARD 7-1-2: A file description and record layout must be provided for each file. 
The file information/metadata header must include the following: 

1. Title of the survey (survey name, part, and year as applicable); 

2. Name(s) of each file; 

3. Reference year for the data; 

4. Version number and date of release; 

5. Logical record length (in positional files) or number of variables on the file 
(delimited files); 

6. Number of records per case or observation; and 

7. Number of cases in the data file. For delimited files also include the delimiters 
(e.g., comma, space). 
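
One way to keep the seven header items together is a small record type that also reports which format-specific items are still missing. The sketch below is only an illustration; the class name `FileHeader`, its field names, and the sample values are assumptions, not part of any NCES file specification.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FileHeader:
    """Header items listed in Standard 7-1-2 (field names are hypothetical)."""
    survey_title: str
    file_names: List[str]
    reference_year: str
    version: str
    release_date: str
    file_format: str  # "positional" or "delimited"
    records_per_case: int
    number_of_cases: int
    logical_record_length: Optional[int] = None  # positional files
    number_of_variables: Optional[int] = None    # delimited files
    delimiter: Optional[str] = None              # delimited files

    def missing_items(self) -> List[str]:
        """List the format-specific header items that have not been supplied."""
        missing = []
        if self.file_format == "positional":
            if self.logical_record_length is None:
                missing.append("logical record length")
        elif self.file_format == "delimited":
            if self.number_of_variables is None:
                missing.append("number of variables")
            if self.delimiter is None:
                missing.append("delimiter")
        else:
            missing.append("file format")
        return missing


# Made-up example values
header = FileHeader(
    survey_title="Example Survey, Part A, 2001",
    file_names=["exsurv01.dat"],
    reference_year="2000-01",
    version="1.0",
    release_date="2002-09-30",
    file_format="delimited",
    records_per_case=1,
    number_of_cases=2,
    number_of_variables=3,
)
print(header.missing_items())
```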


STANDARD 7-1-3: For each variable on the file, the file description must include the 
following: 

1. Variable name; 

2. Data type (alpha or numeric); 

3. Record number (if multiple records per case); 

4. Position within the record (beginning and end, or variable number if delimited), 
field length, and variable label; and 

5. The survey question wording and response categories. 
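
A record layout that carries the items above is enough to read a positional file mechanically. The following minimal sketch assumes a hypothetical three-variable layout; the names `Variable`, `LAYOUT`, and `parse_record` are illustrative and not taken from any NCES product. Positions are 1-based and inclusive, as they would appear in a printed layout.

```python
from typing import NamedTuple


class Variable(NamedTuple):
    name: str
    dtype: str   # "alpha" or "numeric"
    start: int   # 1-based beginning position
    end: int     # inclusive end position
    label: str

    @property
    def length(self) -> int:
        return self.end - self.start + 1


# Hypothetical layout for an 11-character record
LAYOUT = [
    Variable("ID", "alpha", 1, 4, "Unique case identifier"),
    Variable("STATE", "alpha", 5, 6, "State postal abbreviation"),
    Variable("ENROLL", "numeric", 7, 11, "Total enrollment"),
]


def parse_record(line, layout):
    """Slice one positional record into a dict keyed by variable name."""
    parsed = {}
    for var in layout:
        raw = line[var.start - 1:var.end]
        if var.dtype == "numeric":
            # A blank numeric field is returned as None rather than zero
            parsed[var.name] = int(raw) if raw.strip() else None
        else:
            parsed[var.name] = raw.rstrip()
    return parsed


print(parse_record("0001VA  512", LAYOUT))
```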

STANDARD 7-1-4: Data set naming conventions must be standardized and must 
conform to Information Systems Security Organization (ISSO) (or more recent) standards 
for pressing a CD, which currently requires a name with the following format: 
“xxxxxxxx.xxx”. 
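
A naming rule of this kind is easy to check automatically. The sketch below assumes that “xxxxxxxx.xxx” means a base name of up to eight characters followed by an extension of up to three, and that letters, digits, and underscores are the permitted characters; both points are assumptions, since the standard gives only the pattern. The function name is hypothetical.

```python
import re

# Assumed reading of "xxxxxxxx.xxx": 1-8 name characters, a period, 1-3 extension characters.
NAME_PATTERN = re.compile(r"[A-Za-z0-9_]{1,8}\.[A-Za-z0-9]{1,3}")


def is_valid_dataset_name(name):
    """Return True if `name` fits the assumed eight-dot-three pattern."""
    return NAME_PATTERN.fullmatch(name) is not None


for candidate in ["ccd2001.dat", "hsb_sophomore.dat", "readme"]:
    print(candidate, is_valid_dataset_name(candidate))
```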


STANDARD 7-1-5: Jewel box covers and web links or URL links must identify the 
survey system (e.g., HS&B, CCD), component, survey year, and version number. 


STANDARD 7-1-6: All variables must be clearly identified and described. 

1. The description of variables must include the universe for the variable. 

2. In the case of composite variables, the description must identify all survey items 
used to construct the variables and must include the algorithm used to construct 
the variables. 

3. Upper and lower case labels that clearly describe the variables must be used. 

4. For all categorical variables, each value must be associated with a frequency, a 
percentage of total cases, and a label for each category. In public-use and 
restricted-use file documentation, unweighted frequencies must be included (see 
Standard 4-2-10 for public-use files without confidentiality edits). 

5. For all continuous variables, the distribution of values (e.g., minimum, maximum, 
mean, and standard deviation) must be provided. 
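
Items 4 and 5 describe summaries that can be produced directly from the file. The sketch below shows one way to compute unweighted frequencies with percentages for a categorical variable and the basic distribution of a continuous variable. The function names, category codes, and sample values are made up for illustration, and the standard deviation shown is the sample (n - 1) form, which is an assumption because the standard does not say which form to report.

```python
import statistics
from collections import Counter


def categorical_summary(values, labels):
    """Unweighted frequency, percent of total cases, and label for each code."""
    counts = Counter(values)
    total = len(values)
    return [
        (code, labels.get(code, ""), counts[code], round(100 * counts[code] / total, 1))
        for code in sorted(counts)
    ]


def continuous_summary(values):
    """Minimum, maximum, mean, and sample standard deviation."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
    }


# Made-up data: school control codes and class sizes
print(categorical_summary([1, 1, 2, 1], {1: "Public", 2: "Private"}))
print(continuous_summary([2, 4, 4, 6]))
```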

GUIDELINE 7-1-6A: FIPS Standards should be used where applicable. NCES 
standard definitions and codes should be used where applicable (see Standard 1-4). 

GUIDELINE 7-1-6B: Variable names should be consistent across surveys within a 
survey system, within and across years. 

GUIDELINE 7-1-6C: In a printable record layout file, line length should be 
specified so that it prints correctly without wrapping and without special modification 
(e.g., 72 characters, 12 point type). 


STANDARD 7-1-7: Data file documentation must be complete for all data files. This 
includes an abstract or summary that cites the methodology report or technical notes 
associated with the survey and a description of survey methodology that is consistent 
with the NCES standard for survey system documentation (see Standard 3-4). In general, 
survey methodology documentation for data files must include the following: 

1. Description of data collection methods; 

2. Weighting and imputation procedures; 

3. Description of editing, error resolution, and imputation flags; 

4. Guidelines for processing the data; 

5. The reference year for the data; 

6. Unweighted frequency counts, and response rates; 

7. Information on how to use replicate weights or PSUs and strata for 
variance estimation; and 

8. Procedures for using weights to produce estimates. 
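
Items 7 and 8 are often easiest for users to follow when the documentation includes a worked example. The sketch below shows the general shape of a weighted estimate and a replicate-weight variance: the estimate is recomputed once per set of replicate weights, and the squared deviations from the full-sample estimate are summed and multiplied by a factor. The function names and data are hypothetical, and the multiplier depends on the replication method used by the survey (jackknife, balanced repeated replication, and so on), so it must be taken from the survey's own documentation; the value used here is only for a toy delete-one jackknife.

```python
def weighted_mean(values, weights):
    """Weighted estimate of a mean."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def replicate_variance(values, full_weights, replicate_weights, factor):
    """factor * sum over replicates of (replicate estimate - full-sample estimate)^2."""
    full_estimate = weighted_mean(values, full_weights)
    replicate_estimates = [weighted_mean(values, w) for w in replicate_weights]
    return factor * sum((r - full_estimate) ** 2 for r in replicate_estimates)


# Toy example: three cases, each replicate drops one case and reweights the rest.
values = [1.0, 2.0, 3.0]
full_weights = [1.0, 1.0, 1.0]
replicates = [
    [0.0, 1.5, 1.5],
    [1.5, 0.0, 1.5],
    [1.5, 1.5, 0.0],
]
variance = replicate_variance(values, full_weights, replicates, factor=2 / 3)
print(weighted_mean(values, full_weights), variance ** 0.5)
```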


STANDARD 7-1-8: The following data element conventions must be used: 

1. Numeric fields must contain only numbers or blanks. Reserve codes for numeric 
fields should be extreme negative values (e.g., lower than the lowest real value). 

2. “0” must represent zeros. Blanks or dashes may not be used to represent 0s. 

3. Unique values must be used to distinguish between legitimate skips and 
nonresponse. 

4. Suppression symbols must be removed from numeric fields and stored in 
associated "flag" fields. 

5. Separate record locations must be used for all data items. 

6. Imputed data must be flagged in associated “flag” fields. Imputation methods 
must be identified in the flag. Blanks are not legitimate values for flags. 
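
Several of these conventions can be checked mechanically before a file is released. The sketch below tests two of them for a single data item: the numeric field holds only a number or blanks, and its associated flag field is not blank. It is an illustration only; the function name is hypothetical, and the example reserve code (-9) is an assumption, since actual reserve codes are defined by each survey.

```python
import re

# A plain number: optional minus sign, digits, optional decimal part.
NUMERIC = re.compile(r"-?\d+(\.\d+)?")


def check_item(value, flag):
    """Return a list of convention problems for one numeric field and its flag field."""
    problems = []
    text = value.strip()
    if text and not NUMERIC.fullmatch(text):
        problems.append("numeric field holds a non-numeric symbol")
    if not flag.strip():
        problems.append("flag field is blank")
    return problems


print(check_item("  512", "0"))   # reported value with a non-blank flag
print(check_item("   -9", "0"))   # hypothetical reserve code, stored as a negative number
print(check_item("    *", " "))   # suppression symbol left in the data field, blank flag
```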

GUIDELINE 7-1-8A: When practical, numeric data fields containing continuous 
variables should be identical in length. 

SUBJECT: SURVEY DOCUMENTATION IN REPORTS 


NCES STANDARD: 7-2 

PURPOSE: To provide the appropriate amount of documentation on the data, 
methodology, and other important aspects of a survey in each NCES report. Survey 
documentation in the report should enable the reader — even the nonstatistical user — to 
understand its contents, and the use and limitations of data, readily and clearly. 

KEY TERMS: coverage, disclosure risk analysis, frame, instruments, key variables, 
pretest, probability of selection, and survey. 


STANDARD 7-2-1: All NCES reports must include documentation that allows the 
reader to understand the nature and limitations of the results presented. The level of detail 
included will vary depending on the type of report. The general areas to be covered 
include executive summary, status of data, sample design, data collection, and data 
presentation. List 7-2-A outlines the types of documentation to be included in the various 
types of NCES reports. “C” for “Complete” indicates the full item is to be included; “B” 
for “Brief” indicates that a brief description should be included; and “†” indicates not 
applicable. 


STANDARD 7-2-2: Sampling standard errors must be available for all estimates 
included in reports. Sampling standard errors (se’s) or confidence intervals (CI’s) for 
statistics in tables and graphs can be included in reports in their entirety, in which case 
the se’s or CI’s for each table or graph are reported either in a separate table in an 
appendix or in columns accompanying the statistics being presented. Alternatively, 
especially for publications that are targeted to general audiences, a separate table of 
exemplar standard errors on key statistics may be presented in the technical appendix, 
with the detailed standard error tables for all tables and graphs in the report made 
available on the web. 

GUIDELINE 7-2-2A: To caution users who might attempt to independently test for 
certain differences using the standard errors provided, a cautionary note should be 
provided with the standard errors, stating the following: 

Some estimates may be correlated with each other. Generating statistical 
tests for such estimates solely with these standard errors implicitly assumes 
these covariances are zero and may be different from the actual significance 
test used in the report. 
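
The reason for the caution can be seen from the standard error of a difference between two estimates, which is the square root of se1^2 + se2^2 - 2*cov. Dropping the covariance term changes the test statistic whenever the estimates are correlated. The sketch below uses made-up numbers; the function names are hypothetical.

```python
import math


def se_difference(se1, se2, cov=0.0):
    """Standard error of (estimate1 - estimate2) given their covariance."""
    return math.sqrt(se1 ** 2 + se2 ** 2 - 2 * cov)


def t_statistic(est1, est2, se1, se2, cov=0.0):
    """Difference between two estimates divided by its standard error."""
    return (est1 - est2) / se_difference(se1, se2, cov)


# Made-up estimates with standard errors 3 and 4
print(t_statistic(60.0, 50.0, 3.0, 4.0))            # covariance assumed to be zero
print(t_statistic(60.0, 50.0, 3.0, 4.0, cov=4.5))   # positively correlated estimates
```

With a positive covariance the standard error of the difference is smaller and the test statistic larger, so a user working only from the published standard errors could reach a different conclusion than the report.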

List 7-2-A. Checklist for documentation to be included in NCES reports 

                                                  Issue       Compendia   E.D.        Statistics  R&D Report, Survey
                                                  Brief                   TABS        in Brief    Statistical Technical
                                                                                                  Analysis    Report
                                                                                                  Report

EXECUTIVE SUMMARY¹
  History and purpose of the survey               †           B           B           †           B           B
  Target population                               †           B           B           †           B           B
  Time and geographic coverage of the survey      †           B           B           †           B           B
  Main findings                                   †           B           B           †           B           †

STATUS OF DATA
  Identification of data as preliminary,          †           C           C           C           C           †
    revised, or final
  Schedule of revisions                           †           †           C           C           C           †
  Relationship of survey to previous surveys in   †           B           C           C           C           C
    same series

SAMPLE DESIGN
  Target population                               B           B           B           B           C           C
  Size of target population                       B           †           B           B           C           C
  Survey frame, including source of frame,        †           †           B           B           B           C
    reference date, and number of units
  Units selected for sample at each stage         †           †           B           †           B           C
  Number of sampling units at each stage          †           †           B           †           B           C
  Sample allocation procedure at each stage       †           †           B           †           B           C
  Sample selection process at each stage          †           †           B           †           B           C
  Total sample sizes²                             B           †           B           B           C           C
  Response rates and their derivations            †           †           B           B           B           C
  Measures of size defined for sampling with      †           †           †           B           B           C
    probability proportional to size
  Summary of sources of bias                      B           †           B           B           B           C

DATA COLLECTION
  Nature of instruments used (e.g., the contents  †           †           B           B           B           C
    or kinds of data sought in major sections of
    the instrument(s) and number of questions in
    each major section)
  Method(s) of administering the instrument(s)    †           B           B           B           B           C
  Copies of interview scripts/forms/              †           †           B           B           B           C
    questionnaire, or copies upon request
  Quality control procedures used in data         †           †           †           †           †           C
    process and results of their implementation
  Results of pretest and independent evaluations  †           †           †           †           †           C
  Problems, if encountered                        †           B           B           †           B           C
  Type of disclosure limitations used             †           †           †           †           B           C

DATA PRESENTATION
  Definitions of key variables, critical concepts B           B           B           C           C           C
    and constructed variables
  Supporting numbers for graphs                   C³          C           C           C           C           C
  Selected exemplar standard errors for tables    †           B           B           B           B           †
    and graphs

NOTE: The above list outlines the types of documentation to be included in the various types of NCES 
reports. “C” for “Complete” indicates the item is to be included, “B” for “Brief” indicates that a brief 
description should be included, “†” means not applicable. 

¹ Required if report is longer than 15 pages. 

² Can be rounded to nearest 100 for restricted data files. 

³ Numbers not included in graphics in the report must be cited to an existing report. 

SUBJECT: RELEASE AND DISSEMINATION OF REPORTS AND DATA PRODUCTS 


NCES STANDARD: 7-3 

PURPOSE: To ensure that all NCES products are disseminated in ways which help to 
promote the widespread use of NCES data, and to increase the awareness of NCES data 
among potential users. 

KEY TERMS: DAS 


STANDARD 7-3-1: All NCES products must be disseminated according to a plan that 
identifies intended and potential users. 

GUIDELINE 7-3-1A: To ensure that the contents of a product reflect the needs of 
intended users, authors should consider user needs early in the publication development 
process. 

GUIDELINE 7-3-1B: In designing a publication or product, the author should 
consider the web presentation of the final product. 

GUIDELINE 7-3-1C: Once a product has been approved for release by the Chief 
Statistician, an author should arrange a meeting with OC to review proposed 
dissemination strategies including press releases, targeted mailings, the number of 
copies to be printed, web release, the use of print on demand, and the use of both print 
and electronic announcements. 

GUIDELINE 7-3-1D: Innovative ways to disseminate NCES data should be explored. 
Presentations at annual meetings, seminars on specific publications, training on the use 
of data bases, outreach to external groups, and special research efforts using NCES data 
should be encouraged. 

GUIDELINE 7-3-1E: NCES should have strategies in place to collect user feedback 
on the utility of its products and solicit recommendations for making NCES data more 
useful. 


STANDARD 7-3-2: NCES products should utilize a variety of dissemination techniques, 
as outlined in Table 7-3-A. All publications must be produced in PDF format, and all 
mandatory publications must also be produced in HTML format. 

GUIDELINE 7-3-2A: Efforts should be made to produce other publications in HTML 
format as well. 

STANDARD 7-3-3: Staff responsible for NCES products requiring minor revisions must 
prepare an errata sheet for a level 2 statistical review (see Standard 6-1). Staff responsible 
for NCES products requiring major revision must prepare a revised report for a level 2 
statistical review. Reissued revised reports must carry the original NCES number followed 
by “rev.” When minor revisions approved for an errata sheet are incorporated in a web 
release, the NCES number must be followed by “rev.” 

Table 7-3-A. NCES products: Required product formats 

Type of product (columns): Print; Web product; Web tool 

Product: 

Standard products: 
  Compendium 
  Directory 
  NCES Handbook 
  Updated indicator 
  Pre-release data 
  Statistical Analysis Report 
  R&D Report 
  Technical/Methodological Report 
  E.D. TABS 
  Issue Brief 
  Quarterly 
  Re-packaged Excerpt only 
  Guide (e.g., Programs & Plans) 
  Working Paper 
  Data File (including CD ROM/DAS/WEB) 
  Data File Documentation/User's Manual (must accompany data file) 
  Video/Data 
  Conference Report 
  Non-data Videotape (e.g., conference, Commissioner’s statements) 
  Brochure/Pamphlet 
  Newsletter 
  Co-op Product (e.g., FORUM, NPEC) 
  Questionnaire 
  Glossary 

XX Must be produced for this format 
X  Consider producing in this format 

1 Required for all Priority 1 publications, optional for others 

2 Suggested for Universe Files, any format 

GLOSSARY 

-A- 

An accommodation is a change in how a test is presented, in how a test is administered, or 
in how the test taker is allowed to respond. This term generally refers to changes that do 
not substantially alter what the test measures. The proper use of accommodations does not 
substantially change academic level or performance criteria. Appropriate accommodations 
are made to provide equal opportunity to demonstrate knowledge. 

An African American or Black person has origins in any of the black racial groups of 
Africa. Terms such as "Haitian" or "Negro" can be used in addition to "Black or African 
American." 

An American Indian or Alaska Native person has origins in any of the original peoples 
of North and South America (including Central America) and maintains tribal 
affiliation or community attachment. 

An Asian person has origins in any of the original peoples of the Far East, Southeast Asia, 
or the Indian subcontinent, including, for example, Cambodia, China, India, Japan, Korea, 
Malaysia, Pakistan, the Philippine Islands, Thailand, and Vietnam. 

An assessment is any systematic procedure for obtaining information from tests and other 
sources that can be used to draw inferences about characteristics of people, objects, or 
programs. 

An award incentive plan links all or some of the contract deliverables to performance
incentive payments beyond the fixed fee of the contract. There are minimum
performance-based requirements that must be specified in order for a contract to be
considered an Award Incentive performance-based contract.

-B- 

The base weight is the inverse of the probability of selection. 

A bridge study continues an existing methodology concurrent with a new methodology for 
the purpose of defining the relationship between the new and old estimates. 

A Black or African American person has origins in any of the black racial groups of 
Africa. Terms such as "Haitian" or "Negro" can be used in addition to "Black or African 
American." 

-C-

The capture/recapture technique uses two independent frames to estimate the number of 
units missed on both frames. The first step is to match frames to provide counts of units on 
one frame but not the other, as well as a count of units on both frames. With this 
information and several basic assumptions, it is possible to estimate the number of units
missed on both frames. In practice, the two frames may not be completely independent; in 
which case, a number of assumptions will be necessary to proceed with this type of 
estimation. 
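
As a rough illustration of the estimate described above, the sketch below assumes the two frames are fully independent and applies the standard two-list calculation to the matched counts. The function names and the example counts are hypothetical and do not come from any NCES survey.

```python
def estimate_missed_on_both(on_both: int, only_a: int, only_b: int) -> float:
    """Estimate the number of units missing from both frames (independence assumed)."""
    if on_both <= 0:
        raise ValueError("at least one unit must appear on both frames")
    # Under independence, missed / only_b is expected to equal only_a / on_both.
    return only_a * only_b / on_both


def estimate_population(on_both: int, only_a: int, only_b: int) -> float:
    """Estimated population size: all matched and unmatched units plus the missed units."""
    return on_both + only_a + only_b + estimate_missed_on_both(on_both, only_a, only_b)


# Hypothetical match results: 800 units on both frames, 100 only on A, 50 only on B.
missed = estimate_missed_on_both(800, 100, 50)
total = estimate_population(800, 100, 50)
```

When the frames are not independent, this simple ratio no longer holds and the additional assumptions mentioned above are needed.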

Classical test theory postulates that a test score can be decomposed into two parts — a true 
score and an error component; that the error component is random with a mean of zero and 
is uncorrelated with true scores; and that observed scores are linearly related to true scores 
and error components. 

Clustered samples are those in which a naturally occurring group is first selected, such as a 
school or a residential block, and then units are sampled within the selected groups. 

Coarsening disclosure limitation techniques preserve the individual respondent’s data by
reducing the level of detail used to report some variables. Examples of this technique 
include recoding continuous variables into intervals; recoding categorical data into broader 
intervals; and top or bottom coding the ends of continuous distributions. 
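
Two of the coarsening examples listed above, interval recoding and top coding, can be sketched as follows. The function names, the interval width, and the cap are illustrative assumptions, not NCES rules.

```python
def to_interval(value: float, width: float) -> tuple:
    """Recode a continuous value into the half-open interval [lower, lower + width)."""
    lower = (value // width) * width
    return (lower, lower + width)


def top_code(value: float, cap: float) -> float:
    """Replace any value above the cap with the cap itself."""
    return min(value, cap)


# Hypothetical examples: age reported in 10-year bands, income capped at 150,000.
age_band = to_interval(37, 10)
capped_income = top_code(250_000, 150_000)
```

In both cases the respondent's own value is kept but reported with less detail, which is what distinguishes coarsening from perturbation.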

Confidentiality involves the protection of individually identifiable data from unauthorized 
disclosures. 

Confidentiality edits are defined as edits that are applied to microdata for the purpose of 
protecting data that will be released in tabular form. Confidentiality edits are implemented 
using perturbation techniques. These techniques are used to alter the responses in the 
microdata file before tabulations are produced. Thus, all tables are protected in a consistent 
way. Because the perturbation techniques that are used are designed to preserve the level of 
detail in the microdata file, confidentiality edits maximize the information that can be
provided in tables, without requiring cell suppression or controlled rounding. 

A consistent data series maintains comparability over time by keeping an item fixed, or by 
incorporating appropriate adjustment methods in the event an item is changed. 

To be recognized as a Consolidated Metropolitan Statistical Area (CMSA), an area must 
meet the requirements for recognition as an MSA, have a total population of one million or 
more, and have (1) separate component areas that can be identified within the entire area by 
meeting specified statistical criteria, and (2) local opinion that indicates support for the 
component areas. 

Coverage refers to the extent to which all elements on a frame list are members of the 
population, and to which every element in a population appears on the frame list once and 
only once. 

Coverage error refers to the discrepancy between statistics calculated on the frame 
population and the same statistics calculated on the target population. Undercoverage 
errors occur when target population units are missed during frame construction, and 
overcoverage errors occur when units are duplicated or enumerated in error. 

A crosswalk study delineates how categories from one classification system are related to 
categories in a second classification system. 

A cross-sectional sample survey is based on a representative sample of respondents drawn 
from a population at one point in time. 

Cross-sectional imputations are based on data from a single time period. 

Cross-wave imputations are imputations based on data from multiple time periods. For 
example, a cross-sectional imputation for a time 2 salary could simply be a donor's time 2 
salary. Alternatively, a cross-wave imputation could be the proportional change in a
donor's salary from time 1 to time 2 multiplied by the nonrespondent's time 1 salary.
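
The salary example can be sketched as below. This is only an illustration: it reads "change" as the ratio of the donor's time 2 salary to the donor's time 1 salary, which is an assumption, and the function names and dollar amounts are hypothetical.

```python
def cross_sectional_impute(donor_time2: float) -> float:
    """Use the donor's time 2 value directly as the imputed time 2 value."""
    return donor_time2


def cross_wave_impute(donor_time1: float, donor_time2: float,
                      nonrespondent_time1: float) -> float:
    """Apply the donor's time 1 to time 2 ratio to the nonrespondent's time 1 value."""
    if donor_time1 == 0:
        raise ValueError("donor time 1 value must be nonzero to form a ratio")
    return nonrespondent_time1 * (donor_time2 / donor_time1)


# Hypothetical donor salary rises from 40,000 to 42,000; nonrespondent reported 50,000 at time 1.
imputed_cross_sectional = cross_sectional_impute(42_000)
imputed_cross_wave = cross_wave_impute(40_000, 42_000, 50_000)
```

The cross-wave value keeps the nonrespondent's own earlier salary level, while the cross-sectional value ignores it.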

A cut score is a specified point on a score scale such that scores at or above that point are 
interpreted or acted upon differently from scores below that point. 

-D- 

A Data Analysis System (DAS) is an analysis software system that generates tabular 
estimates and correlation coefficients in a framework that allows external users to analyze 
individually identifiable data without allowing the user direct access to individual data 
records. Users are denied access to individual data records because the data are not in a 
directly readable format. Additional safeguards come through the use of population
subsampling and differential weighting from the sample design, as well as confidentiality
edits. The degree of editing required is a direct function of the capabilities of the DAS. As
an example, a DAS that provides weighted totals (i.e., a direct measure of population size) 
within cells would require more confidentiality editing than one that does not provide 
weighted cell totals, because there is a greater risk of disclosure in groups with small 
population size. 

Data swapping is a perturbation disclosure limitation technique that results in a 
confidentiality edit. A simplistic example of data swapping would be to assume a data file 
has two potential individual identifying variables, for example, sex and age. If a sample 
case needs disclosure protection, it is paired with another sampled case so that each 
element of the pair has the same age, but different sexes. The data on these two records are 
then swapped. After the swapping, anyone thinking they have identified either one of the 
paired cases gets the data of the other case, so they have not made an accurate match and 
the data have been protected. 
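
A minimal sketch of the simplistic sex and age example follows. It assumes each record is a dictionary, that the pair has already been chosen, and that "swapping the data" means exchanging every field other than the identifying variables; the field names and values are hypothetical.

```python
def swap_pair(record_a: dict, record_b: dict, id_vars=("sex", "age")) -> tuple:
    """Return copies of two paired records with all non-identifying data exchanged."""
    # Each output keeps its own identifying variables but carries the partner's data.
    new_a = {**record_b, **{k: record_a[k] for k in id_vars}}
    new_b = {**record_a, **{k: record_b[k] for k in id_vars}}
    return new_a, new_b


# Hypothetical pair: same age, different sexes.
case_1 = {"sex": "F", "age": 34, "income": 52_000}
case_2 = {"sex": "M", "age": 34, "income": 61_000}
swapped_1, swapped_2 = swap_pair(case_1, case_2)
```

Anyone who matches on sex and age and believes they have found case 1 now sees the income reported by case 2.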

DEFT is the square root of a design effect. 

A derived score is a raw score converted by numerical transformation into a new score 
providing a more meaningful and/or different measure (e.g., conversion of raw scores to 
percentile ranks, standard scores, or grade equivalence). 

The design effect (DEFF) is the ratio of the true variance of a statistic (taking the complex 
sample design into account) to the variance of the statistic for a simple random sample with 
the same number of cases. Design effects differ for different subgroups and different 
statistics; no single design effect is universally applicable to any given survey or analysis. 

Differential Item Functioning (DIF) exists when examinees of equal ability differ on an
item solely because of their membership in a particular group. 

Disability is a physical or mental impairment that substantially limits one or more of the 
major life activities (42 U.S.C. 12102).

Disclosure risk analysis is used to determine which records require masking to produce a
public-use data file from a restricted-use data file. 

Domain refers to a defined universe of knowledge, skills, abilities, attitudes, interests, or 
other human characteristics. 

Dual-frame estimation uses a dual-frame design to combine two frames in the same 
survey to offer coverage rates that may exceed those of any single frame. Sometimes the 
best available list is known to have poor coverage and there are no known supplemental 
frames to provide sufficient coverage. For example, an area frame could be used as the 
second frame. 

-E- 

Editing is a procedure that uses available information and some assumptions to derive
substitute values for inconsistent values in a data file. 

Effect size refers to the standardized magnitude of the effect or the departure from the null 
hypothesis. For example, the effect size may be the amount of change over time, or the 
difference between two population means, divided by the appropriate population standard 
deviation. Multiple measures of effect size can be used (e.g., standardized differences 
between means, correlations, and proportions). 
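
One of the measures mentioned above, the standardized difference between two means, can be sketched as follows. The function name and the example scores are hypothetical.

```python
def standardized_mean_difference(mean_1: float, mean_2: float,
                                 population_sd: float) -> float:
    """Difference between two means divided by the appropriate standard deviation."""
    if population_sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (mean_1 - mean_2) / population_sd


# Hypothetical scale scores: means of 260 and 250 with a standard deviation of 40.
effect = standardized_mean_difference(260.0, 250.0, 40.0)
```

Here a 10-point difference corresponds to one quarter of a standard deviation.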

The effective sample size, as used in the design phase, is the sample size under a simple 
random sample design that is equivalent to the actual sample under the complex sample 
design. In the case of complex sample designs, the actual sample size is determined by
multiplying the effective sample size by the anticipated design effect. 
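
The relationships among the design effect, DEFT, and the two sample sizes can be sketched as below. The function names and the variance figures are hypothetical, and rounding the required sample size up to a whole case is an added assumption.

```python
import math


def design_effect(var_complex: float, var_srs: float) -> float:
    """DEFF: variance under the complex design over variance under SRS of the same size."""
    return var_complex / var_srs


def deft(deff: float) -> float:
    """DEFT: the square root of the design effect."""
    return math.sqrt(deff)


def actual_sample_size(effective_n: int, anticipated_deff: float) -> int:
    """Cases needed under the complex design to match a given effective sample size."""
    return math.ceil(effective_n * anticipated_deff)


# Hypothetical variances of 6.0 (complex design) and 4.0 (simple random sample).
deff = design_effect(6.0, 4.0)
needed = actual_sample_size(400, deff)
```

With a design effect of 1.5, a design that needs the precision of 400 simple random cases must field 600 cases.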

Equating of two tests is established when examinees of every ability level and from every 
population group can be indifferent about which of two tests they take. Not only should 
they have the same expected mean score on each test, but they should also have the same 
errors of measurement. 

Estimation is the process of using sample data to provide a single best value for a 
parameter (such as a mean, proportion, correlation, or effect size), or to provide a range of 
values in the form of a confidence interval. 

-F- 

Fairness of a test is attained when construct-irrelevant personal characteristics such as
race, ethnicity, sex, or disability have no appreciable effect on test results or their 
interpretation. 

In a field test, all or some of the survey procedures are tested on a small scale that mirrors 
the planned full-scale implementation. 

A frame is a mapping of the universe elements (i.e., sampling units) onto a finite list (e.g., 
the population of schools on the day of the survey). 

The frame population is the set of elements that can be enumerated prior to the selection 
of a survey sample. 

A freshened sample includes new cases added to a longitudinal sample plus the retained 
cases from the longitudinal sample used to produce cross-sectional estimates of the 
population at the time of a subsequent wave of a longitudinal data collection. 

-H- 

The half-open interval technique is used to increase coverage. In this technique, new in- 
scope units between a unit A on the previous frame up to, but not including, unit B (the 
next unit on the previous frame) are associated with unit A. These new units have the same 
selection probability as unit A's. This process is repeated for every unit on the frame. The 
new units associated with the actual sample cases are now included in the sample with their 
respective selection probabilities. For example, in the case of freshening the sample, this 
technique may be applied to a new list that includes cases that were covered in a previous 
frame, as well as new in-scope units not included in the previous frame. 
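
The linking step can be sketched as below. This is a simplified illustration that assumes the updated list is ordered so that new units fall between the previous-frame units they follow; new units that appear before the first previous-frame unit are ignored here, and all names are hypothetical.

```python
def half_open_additions(updated_list, previous_frame, sampled_units):
    """Map each sampled unit to the new units in its half-open interval [A, B)."""
    previous = set(previous_frame)
    linked = {}
    current = None
    for unit in updated_list:
        if unit in previous:
            current = unit  # start of the next half-open interval
        elif current is not None:
            linked.setdefault(current, []).append(unit)  # new unit inherits A's probability
    return {unit: linked.get(unit, []) for unit in sampled_units}


# Hypothetical lists: n1, n2, and n3 are new in-scope units; A and C were sampled.
added = half_open_additions(
    updated_list=["A", "n1", "B", "C", "n2", "n3", "D"],
    previous_frame=["A", "B", "C", "D"],
    sampled_units=["A", "C"],
)
```

Because the new units enter the sample only through the sampled unit they are linked to, they carry that unit's selection probability.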

A Hispanic or Latino person is of Cuban, Mexican, Puerto Rican, South or Central 
American, or other Spanish culture or origin, regardless of race. The term "Spanish origin" 
can be used in addition to "Hispanic or Latino." 

Hypothesis testing draws a conclusion about the tenability of a stated value for a 
parameter. For example, sample data may be used to test whether an estimated value of a 
parameter (such as the difference between two population means) is sufficiently different 
from zero that the null hypothesis, designated H0 (no difference in the population means),
can be rejected in favor of the alternative hypothesis, H1 (a difference between the two
population means). 

-I- 

Imputation is a procedure that uses available information and some assumptions to derive 
substitute values for missing values in a data file. 

An Individualized Education Plan (IEP) refers to a written statement for each individual
with a disability that is developed, reviewed, and revised in accordance with Title 20
U.S.C. Section 1414(d).

Individually identifiable data refers specifically to data from any list, record, response 
form, completed survey, or aggregation about an individual(s) from which information
about particular individuals may be revealed by either direct or indirect means. 

Instrument refers to an evaluative device that includes tests, scales, and inventories to 
measure a domain using standardized procedures. 

Item nonresponse occurs when a respondent fails to respond to one or more relevant 
item(s) on a survey. 

Item Response Theory (IRT) postulates that the probability of correct responses to a set 
of test questions is a function of true proficiency and of one or more parameters specific to 
each test question. 

-K- 

Key variables include survey-specific items for which aggregate estimates are commonly 
published by NCES. They include, but are not restricted to, variables most commonly used 
in table row stubs. Key variables also include important analytic composites and other 
policy-relevant variables that are essential elements of the data collection. They are first 
defined in the initial planning stage of a survey, but may be added to as the survey and 
resulting analyses develop. For example, the National Assessment of Educational Progress 
(NAEP) consistently uses gender, race-ethnicity, urbanicity, region, and school type 
(public/private) as key reporting variables. 

-L- 

A Latino or Hispanic person is of Cuban, Mexican, Puerto Rican, South or Central 
American, or other Spanish culture or origin, regardless of race. The term "Spanish origin" 
can be used in addition to "Hispanic or Latino." 

Linkage results from placing two or more tests on the same scale, so that scores can be 
used interchangeably. 

A longitudinal sample survey follows the experiences and outcomes over time of a 
representative sample of respondents (i.e., a cohort) who are defined based on a shared 
experience (e.g., shared birth year or grade in school). 

-M- 

Metadata contain information about the microdata.

Metropolitan Statistical Areas (MSAs) are those areas that (1) include a city of at least
50,000 population, or (2) include a Census Bureau-defined urbanized area (of at least 
50,000 population) with a total metropolitan population of at least 100,000 (75,000 in New 
England). In addition to the county(ies) containing the main city or urbanized area, an 
MSA may include additional counties that have strong economic and social ties to the 
central county(ies) and meet specified requirements of metropolitan character. The ties are 
determined chiefly by census data on commuting to work. A metropolitan statistical area 
may contain more than one city with a population of 50,000 and may cross state lines. 

The minimum substantively significant effect (MSSE) is the smallest effect, that is, the 
smallest departure from the null hypothesis, considered to be important for the analysis of 
key variables. The minimum substantively significant effect is determined during the
design phase. For example, the planning document should provide the minimum change in 
key variables or perhaps, the minimum correlation, r, between two variables that the survey 
should be able to detect for a specified population domain, or subdomain of analytic 
interest. The MSSE should be based on a broad knowledge of the field, related theories, 
and supporting literature. 

Multiplicity estimation is a technique used to adjust selection probabilities when the unit 
of interest has multiple chances of being selected. For example, in a random-digit dialing 
household survey, households with multiple phone numbers have a probability of being 
selected more than once. In this case, by identifying the number of distinct telephone 
numbers in a household, the sampling weights can be adjusted to generate an unbiased 
household weight. 
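
For the telephone example, a common simplification divides the base weight by the number of distinct telephone numbers, since each number gives the household a separate chance of selection. The sketch below makes that assumption; the function name and values are hypothetical.

```python
def multiplicity_adjusted_weight(base_weight: float, n_phone_numbers: int) -> float:
    """Divide the base weight by the household's number of distinct telephone numbers."""
    if n_phone_numbers < 1:
        raise ValueError("a sampled household must have at least one telephone number")
    return base_weight / n_phone_numbers


# Hypothetical household with a base weight of 1,000 and two distinct numbers.
household_weight = multiplicity_adjusted_weight(1_000.0, 2)
```

A household with two numbers is about twice as likely to be reached, so it represents half as many households.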

-N- 

A Native Hawaiian or Other Pacific Islander person has origins in any of the original 
peoples of Hawaii, Guam, Samoa, or other Pacific Islands. 

New England County Metropolitan Areas (NECMAs) are county-based alternatives to 
the city- and town-based metropolitan areas that are used in the rest of the country. The 
NECMA for an MSA or CMSA includes (1) the county containing the city named first in 
that MSA/CMSA title (this county may include the cities named first for other 
MSAs/CMSAs), and (2) each additional county having at least half its population in the 
MSA/CMSA(s) whose cities that are listed first are in the county identified in step 1. 
NECMAs are not defined for individual PMSAs. 

Noncoverage involves eligible units of the target population that are missing from the 
frame population; this includes the problems of incomplete frames and missing units. 

Nonresponse bias occurs when the observed value deviates from the population parameter
due to differences between respondents and nonrespondents. Nonresponse bias is likely to
occur as a result of not obtaining 100 percent response from the selected cases.

Nonsampling error includes measurement errors due to nonresponse, coverage,
interviewers, respondents, instruments, processing, and mode.

-O-

An Other Pacific Islander or Native Hawaiian person has origins in any of the original 
peoples of Hawaii, Guam, Samoa, or other Pacific Islands. 

Overall unit nonresponse reflects a combination of unit nonresponse across two or more 
levels of data collection, where participation at the second stage of data collection is 
conditional upon participation in the first stage of data collection. 

Overcoverage errors occur when units are duplicated or enumerated in error. 

-P- 

Perturbation disclosure limitation techniques directly alter the individual respondent’s 
data for some variables, but preserve the level of detail in all variables included in the 
microdata file. Blanking and imputing for randomly selected records; blurring (e.g., 
combining multiple records through some averaging process into a single record); adding 
random noise; and data swapping or switching (e.g., switching the sex variable from a
predetermined pair of individuals) are all examples of perturbation techniques. 

In a pilot test, a laboratory or a very small-scale test of a questionnaire or procedure is
conducted. 

A planning document includes a justification for a study, a description of the survey 
design and methodology, an analysis plan, a survey evaluation plan, and a cost estimate. 

The potential magnitude of nonresponse bias can be estimated by taking the product of 
the nonresponse rate and the difference in values of a characteristic between respondents 
and nonrespondents. 

The power (1 - β) of a test is defined as the probability of rejecting the null hypothesis when
a specific alternative hypothesis is assumed. For example, with β = 0.20 for a particular
alternative hypothesis, the power is 0.80, which means that 80 percent of the time the test 
statistic will fall in the rejection region if the parameter has the value specified by the 
alternative hypothesis. 
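
As an illustration, the power of a two-sided z test can be computed from the standard normal distribution. The sketch assumes the test statistic is normally distributed with unit variance and that the alternative is expressed as the true difference divided by its standard error; the function name is hypothetical.

```python
from statistics import NormalDist


def power_two_sided_z(standardized_difference: float, alpha: float = 0.05) -> float:
    """Probability that a two-sided z test rejects H0 under a specific alternative."""
    std_normal = NormalDist()
    critical = std_normal.inv_cdf(1 - alpha / 2)
    # Probability of landing in the upper or the lower rejection region.
    upper = 1 - std_normal.cdf(critical - standardized_difference)
    lower = std_normal.cdf(-critical - standardized_difference)
    return upper + lower


# A true difference of 2.8 standard errors gives power close to 0.80 at alpha = 0.05.
power = power_two_sided_z(2.8)
beta = 1 - power
```

When the standardized difference is zero, the null hypothesis is true and the same function returns the alpha level.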

Precision of survey results refers to how closely the results from a sample can reproduce 
the results that would be obtained from a complete count (i.e., census) conducted using the 
same techniques. The difference between a sample result and the result from a complete 
census taken under the same conditions is known as the precision of the sample result. 

A survey pretest involves experimenting with different components of the questionnaire or 
survey design or operationalization prior to full-scale implementation. This may involve 
pilot testing, that is a laboratory or a very small-scale test of a questionnaire or procedure, 
or a field test in which all or some of the survey procedures are tested on a small scale that 
mirrors the planned full-scale implementation. 

A point estimate involves using the value of a particular sample statistic to estimate the 
value for a parameter of interest. 

Primary Metropolitan Statistical Areas (PMSAs) are the component areas of a CMSA. 
If no PMSAs are recognized, the entire area is designated an MSA. 

The probability of selection is the probability that an element will be drawn in a sample. 
In a simple random selection, this probability is the number drawn in the sample divided by 
the number of elements on the sampling frame. 
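
For simple random selection, the probability of selection and the corresponding base weight (its inverse) can be sketched as below. The function names and counts are hypothetical.

```python
def selection_probability(n_sampled: int, frame_size: int) -> float:
    """Probability of selection under simple random selection."""
    if not 0 < n_sampled <= frame_size:
        raise ValueError("sample size must be between 1 and the frame size")
    return n_sampled / frame_size


def base_weight(probability: float) -> float:
    """Base weight: the inverse of the probability of selection."""
    return 1.0 / probability


# Hypothetical design: 50 schools drawn from a frame of 1,000 schools.
p = selection_probability(50, 1_000)
w = base_weight(p)
```

Each sampled school then represents about 20 schools on the frame.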

A public-use data file includes a subset of data that have been coded, aggregated, or 
otherwise altered to mask individually identifiable information, and thus, is available to all
external users. Unique identifiers, geographic detail, and other variables that cannot be 
suitably altered are not included in public-use data files. 

Public-use edits are based on an assumption that external users have access to both 
individual respondent records and secondary data sources that include data which could be 
used to identify respondents. For this reason, the editing process is relatively extensive. 
When determining an appropriate masking process, the public-use edit takes into account 
and guards against matches on common variables from all known files that could be 
matched to the public-use file. 

-R- 

Raking is a method of adjusting sample estimates to known marginal totals from an 
independent source. For a two-dimensional case, the procedure uses the sample weights to 
proportionally adjust the weights to one set of marginals. Next, these adjusted weights are 
proportionally adjusted to the second set of marginals. This two-step adjustment process is 
repeated a number of times until the adjusted sample weights converge simultaneously to
both sets of marginals. 
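
The two-dimensional procedure can be sketched as below. For brevity the sketch works on a table of summed sample weights by cell rather than on unit-level weights, and it assumes every cell is positive and that the row and column targets add to the same total; the function name, tolerance, and figures are hypothetical.

```python
def rake(cell_weights, row_targets, col_targets, tol=1e-9, max_iter=1000):
    """Alternately scale rows and columns until both sets of marginals are matched."""
    w = [list(row) for row in cell_weights]
    n_rows, n_cols = len(w), len(w[0])
    for _ in range(max_iter):
        # Step 1: scale each row so its total matches the row marginal.
        for i in range(n_rows):
            factor = row_targets[i] / sum(w[i])
            w[i] = [x * factor for x in w[i]]
        # Step 2: scale each column so its total matches the column marginal.
        for j in range(n_cols):
            factor = col_targets[j] / sum(w[i][j] for i in range(n_rows))
            for i in range(n_rows):
                w[i][j] *= factor
        # Columns now match (up to rounding); stop once the rows also match.
        if all(abs(sum(w[i]) - row_targets[i]) < tol for i in range(n_rows)):
            return w
    raise RuntimeError("raking did not converge")


# Hypothetical 2 x 2 table of summed weights and independent control totals.
raked = rake([[40.0, 30.0], [35.0, 50.0]],
             row_targets=[80.0, 90.0], col_targets=[85.0, 85.0])
```

Each cell's final adjustment factor (raked total over original total) would then be applied to the weights of the units in that cell.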

A random-digit dial sample survey randomly selects respondents based on a sample of 
phone numbers and information obtained using a screener questionnaire.

The reference year is the year about which the data were collected. 

The rejection region is defined by the alternative hypothesis H1 and the α level. If the test
statistic is in this region, the null hypothesis is rejected. 

Reliability is the degree to which test scores for a group of test takers are consistent over 
repeated applications of a measurement procedure and hence are inferred to be dependable 
and repeatable for an individual test taker. 

Replication methods are approximate variance methods that estimate the variance based 
on the variability of estimates formed from subsamples of the full sample. The subsamples 
are generated to properly reflect the variability due to the sample design. 
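
A delete-one jackknife is one simple replication method. The sketch below applies it to an unweighted simple random sample, which is an assumption made for brevity; in a complex design the replicates would instead be formed by dropping sampled clusters within strata. The function names and data are hypothetical.

```python
def jackknife_variance(values, estimator):
    """Delete-one jackknife variance of an estimator over a list of observations."""
    n = len(values)
    # Each replicate estimate is computed from the subsample that omits one case.
    replicates = [estimator(values[:i] + values[i + 1:]) for i in range(n)]
    mean_replicate = sum(replicates) / n
    return (n - 1) / n * sum((r - mean_replicate) ** 2 for r in replicates)


def sample_mean(values):
    return sum(values) / len(values)


# Hypothetical data; for the mean, the jackknife reproduces s^2 / n.
variance_of_mean = jackknife_variance([2.0, 4.0, 4.0, 6.0, 9.0], sample_mean)
```

The same function works for estimators without a simple variance formula, such as a ratio or a median, which is the main appeal of replication.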

Required response items include the minimum set of items required for a case to be 
considered a respondent. 

Response rates calculated using base weights measure the proportion of the sample frame 
that is represented by the responding units in each study. 

A restricted-use data file includes individually identifiable information that is confidential 
and protected by law. Restricted-use data files are not required to include variables that 
have undergone coarsening disclosure risk edits. 

-S- 

Sampling error is the error associated with nonobservation, that is, the error that occurs 
because all members of the frame population are not measured. It is the error associated 
with the variation in samples drawn from the same frame population. The variance equals 
the square of the sampling error. 

Scaling refers to the process of assigning a scale score based on the pattern of responses. 

Scoring/rating is the process of evaluating the quality of the examinee’s responses to
individual cognitive questions. 

Section 504 of the Rehabilitation Act of 1973, as amended (Title 29 U.S.C. 794 Section 
504), prohibits discrimination on the basis of handicap in federally assisted programs and 
activities. 

Simple comparison is a test (such as a t test or a z test), of the difference between two 
means or proportions. 
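
A z test of the difference between two estimates can be sketched as below. It assumes the two estimates are independent, so their sampling variances add, and that the statistic is approximately standard normal; the function name and figures are hypothetical.

```python
import math
from statistics import NormalDist


def z_test_difference(est_1: float, se_1: float, est_2: float, se_2: float):
    """Return the z statistic and two-sided p-value for est_1 - est_2."""
    z = (est_1 - est_2) / math.sqrt(se_1 ** 2 + se_2 ** 2)
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical percentages: 52.0 (SE 1.5) versus 47.0 (SE 2.0).
z, p_value = z_test_difference(52.0, 1.5, 47.0, 2.0)
```

The standard errors passed in should already reflect the sample design, for example through replication or Taylor-series methods.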

Simple Random Sampling (SRS) uses equal probability sampling with no strata or clusters.
Most statistical analysis software assumes SRS and independently distributed errors. 

Stage of data collection includes any stage or step in the sample identification and data 
collection process in which data are collected from the identified sample unit. This includes 
information obtained that is required to proceed to the next stage of sample selection or
data collection (e.g., school district permission for schools to participate or schools 
providing lists of teachers for sample selection of teachers). 

Statistical disclosure limitation techniques are used to prepare microdata files for release; 
included are perturbation techniques and coarsening techniques. 

A statistical inference is a decision about one or more unknown or unobserved population 
parameter(s) based on estimation and/or hypothesis testing. 

Strata are created by partitioning the frame and are generally defined to include relatively
homogeneous units within strata. 

Substitutions are done using matched pairs, in which the alternate member of the pair does 
not have an independent probability of selection. 

A supplemental area frame can be created. This is often done by first generating a frame
of geographic units where all the geographic units are represented, providing full 
geographic coverage. Next, a probability sample of the geographic units is selected. An 
intensive search procedure is carried out in each selected area. This generates a 
supplemental area frame for each selected area. Assuming no error in the search process, 
the supplemental area frame has complete coverage and the cases can be weighted to 
represent a national estimate. The data from both the main list frame and the supplemental 
area frame are then combined so that the weighted sample estimates provide complete 
coverage. 

An individual survey is driven by one data collection form, such as the Private School 
Survey or the Academic Library Survey. 

A survey system is a set of individual surveys that are interrelated components of a data 
collection, such as the Schools and Staffing Survey or the Integrated Postsecondary 
Education Data System. 

The survey year is the year in which the data were collected. 

-T- 

The tail of the sampling distribution of the test statistic contains the rejection region for the 
hypothesis tested, H0.

The target population is the finite set of observable or measurable elements (i.e., sampling 
units) that will be studied. 

Taylor-series linearization is an approximate variance method in which an estimate is 
linearized as a first step. The variance of the linearized estimate is then computed using 
either an exact or approximate variance formula appropriate for the sample design.

Total nonresponse reflects a combination of the overall unit nonresponse and item 
nonresponse for a specific item. 

Type I error is made when the tested hypothesis, H0, is falsely rejected when in fact it is
assumed true. The probability of making a Type I error is denoted by alpha (α). For
example, with an alpha level of 0.05, the analyst will conclude that a difference is present 
in 5 percent of tests where the null hypothesis is true. 

Type II error is made when the null hypothesis, H0, is not rejected when in fact a specific
alternative hypothesis, H1, is assumed true. The probability of making a Type II error is
denoted by beta (β). For example, with a beta level of 0.20, the analyst will conclude that
no difference is present in 20 percent of all cases in which the specific hypothesized
alternative, H1, is true.

-U-

Undercoverage errors occur when target population units are missed during frame 
construction. 

Un-duplication involves the process of deleting units that are erroneously in the frame 
more than once to correct for overcoverage. 

Unit nonresponse occurs when a respondent fails to respond to all required response items
(i.e., fill out or return a data collection instrument). 

A universe survey involves the collection of data covering all known units in a population 
(i.e., a census). 

-V- 

Validity is the extent to which a test or set of operations measures what it is supposed to 
measure. Validity refers to the appropriateness of inferences from test scores or other 
forms of assessment. 

Variance is the error associated with nonobservation, that is, the error that occurs because 
all members of the frame population are not measured. It is the error associated with the 
variation in samples drawn from the same frame population. The variance equals the
square of the sampling error.

-W- 

A wave is a round of data collection in a longitudinal survey (e.g., the base year and each 
successive follow-up are each waves of data collection). 

A White person has origins in any of the original peoples of Europe, the Middle East, or 
North Africa. 

APPENDIX A: Race and Ethnicity


All combinations of 5 races and 1 ethnicity (64 combinations)

Hispanic     Not Hispanic
or Latino    or Latino*      Race combination

Single race
     1            33         White
     2            34         Black or African American
     3            35         Asian
     4            36         American Indian or Alaska Native
     5            37         Native Hawaiian or Other Pacific Islander

Combination of two races
     6            38         White and Black or African American
     7            39         White and Asian
     8            40         White and American Indian or Alaska Native
     9            41         White and Native Hawaiian or Other Pacific Islander
    10            42         Black or African American and Asian
    11            43         Black or African American and American Indian or Alaska Native
    12            44         Black or African American and Native Hawaiian or Other Pacific Islander
    13            45         Asian and American Indian or Alaska Native
    14            46         Asian and Native Hawaiian or Other Pacific Islander
    15            47         American Indian or Alaska Native and Native Hawaiian or Other Pacific Islander

Combination of three races
    16            48         White and Black or African American and Asian
    17            49         White and Black or African American and American Indian or Alaska Native
    18            50         White and Black or African American and Native Hawaiian or Other Pacific Islander
    19            51         White and Asian and American Indian or Alaska Native
    20            52         White and Asian and Native Hawaiian or Other Pacific Islander
    21            53         White and American Indian or Alaska Native and
                               Native Hawaiian or Other Pacific Islander
    22            54         Black or African American and Asian and Native Hawaiian or Other Pacific Islander
    23            55         Black or African American and Asian and American Indian or Alaska Native
    24            56         Black or African American and Native Hawaiian or Other Pacific Islander and
                               American Indian or Alaska Native
    25            57         Asian and Native Hawaiian or Other Pacific Islander and
                               American Indian or Alaska Native

Combination of four races
    26            58         White and Black or African American and Asian and
                               American Indian or Alaska Native
    27            59         White and Black or African American and American Indian or Alaska Native and
                               Native Hawaiian or Other Pacific Islander
    28            60         White and Asian and American Indian or Alaska Native and
                               Native Hawaiian or Other Pacific Islander
    29            61         White and Black or African American and American Indian or Alaska Native and
                               Native Hawaiian or Other Pacific Islander
    30            62         Black or African American and Asian and American Indian or Alaska Native and
                               Native Hawaiian or Other Pacific Islander

Combination of five races
    31            63         White and Black or African American and Asian and
                               American Indian or Alaska Native and
                               Native Hawaiian or Other Pacific Islander

No race specified or refused
    32            64         No race specified or refused

* Includes not reported.


APPENDIX B: Imputation
EVALUATING THE IMPACT OF IMPUTATIONS FOR ITEM NONRESPONSE

Marilyn Seastrom, Steve Kaufman, Ralph Lee 

An incomplete data record for a survey respondent results in item nonresponse that cannot be 
ignored. Survey nonresponse can result in an increase in the mean square errors of survey 
estimates and a distortion of the univariate and multivariate distributions of survey variables, and 
thus may result in biased estimates of means, variances, and covariances (FCSM 2001). 


Measuring Bias 

The degree of nonresponse error or bias is a function of two factors: the nonresponse rate and how 
much the respondents and nonrespondents differ on survey variables of interest. For example, in 
the case of item nonresponse on family income, a comparison of the characteristics of the 
respondents and nonrespondents on other items that were completed by the item nonrespondent 
can be used to assess whether there are any systematic differences. In the case of our example, 
parent’s education, parent’s occupation, and race-ethnicity (or a longer list) might be good 
candidates to examine for an indication of the amount of bias associated with the missing income 
data. 


The mathematical formulation to estimate bias for a sample mean is: 


$$B(\bar{y}_r) = \bar{y}_r - \bar{y}_t = \left(\frac{n_m}{n_t}\right)\left(\bar{y}_r - \bar{y}_m\right)$$

where:

$\bar{y}_t$ = the mean based on all sample cases, using the base weight
$\bar{y}_r$ = the mean based only on respondent cases, using the base weight
$\bar{y}_m$ = the mean based only on nonrespondent cases, using the base weight
$n_t$ = the number of cases in the sample (i.e., $n_t = n_r + n_m$), using the base weight
$n_m$ = the number of nonrespondent cases, using the base weight
$n_r$ = the number of respondent cases, using the base weight

$\bar{y}_r$ is approximately unbiased if either the proportion of nonrespondents ($n_m/n_t$) is small or the
nonrespondent mean, $\bar{y}_m$, is close to the respondent mean, $\bar{y}_r$.
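
The identity above can be checked numerically for an item that is known for both groups (for
example, a frame variable). The following is a minimal sketch, not an NCES procedure; the function
name `mean_bias` and the data are hypothetical, and each case is assumed to be a
(value, base weight) pair.

```python
def mean_bias(resp, nonresp):
    """Return (direct, formula) versions of the bias of the respondent mean.

    Each argument is a list of (value, base_weight) pairs.
    """
    def wsum(cases):
        return sum(w for _, w in cases)

    def wmean(cases):
        return sum(y * w for y, w in cases) / wsum(cases)

    n_r, n_m = wsum(resp), wsum(nonresp)
    n_t = n_r + n_m
    y_r, y_m = wmean(resp), wmean(nonresp)
    y_t = wmean(resp + nonresp)

    direct = y_r - y_t                   # definition: respondent mean minus full-sample mean
    formula = (n_m / n_t) * (y_r - y_m)  # nonresponse rate times the difference in means
    return direct, formula


# Hypothetical (value, base weight) pairs.
resp = [(40.0, 1.0), (55.0, 2.0), (60.0, 1.0)]
nonresp = [(30.0, 1.0), (35.0, 1.0)]
direct, formula = mean_bias(resp, nonresp)
```

Both expressions agree, so the bias shrinks when either the weighted nonresponse rate or the
gap between the two group means shrinks.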

The relative bias provides a measure of the magnitude of the bias: 

$$\mathrm{RelB}(\bar{y}_r) = \frac{B(\bar{y}_r)}{\bar{y}_r}$$

where:

$\mathrm{RelB}(\bar{y}_r)$ = the relative bias with respect to the estimate, $\bar{y}_r$.

The bias ratio provides an indication of how confidence intervals are affected by bias:

$$\text{Bias Ratio} = \frac{B(\bar{y}_r)}{\sigma_{\bar{y}_r}}$$

where:

$\sigma_{\bar{y}_r}$ = the standard error of $\bar{y}_r$.

Next, the estimated total for variable y is the sum of the estimates for the respondents and the
nonrespondents:

$$y_t = y_r + y_m$$

which is also equal to the product of the number of respondents times the mean value for the
respondents added to the number of nonrespondents times the mean value for nonrespondents:

$$y_t = y_r + y_m = n_r \bar{y}_r + n_m \bar{y}_m$$

The bias for the estimate of a total, $y_r$, is:

$$B(y_r) = y_r - y_t = -y_m = -n_m \bar{y}_m$$

Thus, the bias is small if the number of nonrespondents is small or if the mean for nonrespondents 
is low. 

The bias for an estimate of variance is: 


$$B(s_r^2) = \left(\frac{n_m}{n_t}\right)\left(s_r^2 - s_m^2\right) - \left(\frac{n_r}{n_t}\right)\left(\frac{n_m}{n_t}\right)\left(\bar{y}_r - \bar{y}_m\right)^2$$


Note that the first term is similar to the equation for the bias of the mean, in that it is the
product of the nonresponse rate and a difference; in this case, the difference between the
variance of the respondents and that of the nonrespondents. The second term is the product of the
response rate, the nonresponse rate, and the squared difference between the means for the
respondents and the nonrespondents.

Suppose the variances for respondents and nonrespondents are similar (a more reasonable
assumption than assuming this for the means). Then the first term, the nonresponse rate times zero
or a small difference, is negligible. When this is the case, the bias in the variance is a function
of the product of the response and nonresponse rates and the squared difference in the mean
values for respondents and nonrespondents. In other words, the bias in the variance is a function
of the amount of nonresponse and the difference in the means for respondents and nonrespondents,
and it will always result in an underestimate of the variance.
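
The two-term decomposition of the variance bias can be verified on a small data set. This is a
sketch under stated assumptions: cases are unweighted and variances use the divide-by-n
(population) form, in which case the identity is exact; the function name `variance_bias` and the
values are hypothetical.

```python
from statistics import mean, pvariance


def variance_bias(resp, nonresp):
    """Bias of the respondent variance, from the two-term formula."""
    n_t = len(resp) + len(nonresp)
    p_r, p_m = len(resp) / n_t, len(nonresp) / n_t
    first = p_m * (pvariance(resp) - pvariance(nonresp))       # difference in variances
    second = p_r * p_m * (mean(resp) - mean(nonresp)) ** 2     # squared difference in means
    return first - second


# Hypothetical values: 70 percent response, nonrespondents lower on average.
resp = [48.0, 50.0, 52.0, 54.0, 56.0, 49.0, 51.0]
nonresp = [30.0, 34.0, 38.0]

formula_bias = variance_bias(resp, nonresp)
direct_bias = pvariance(resp) - pvariance(resp + nonresp)
```

Here the formula and the direct difference coincide, and both are negative: the respondent
variance understates the full-sample variance.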

Consider the example in which the variance is the same for respondents and nonrespondents and
the response rate is 70 percent. The bias formula reduces to the second term:

$$B(s_r^2) = -\left(\frac{n_r}{n_t}\right)\left(\frac{n_m}{n_t}\right)\left(\bar{y}_r - \bar{y}_m\right)^2$$

The product of the response and nonresponse rates is .21 (.70 × .30), and the squared difference of
the means, some value z, will be positive regardless of which mean is larger. The bias is then
equal to:

$$B(s_r^2) = -.21(z)$$

If the variances of the respondents and nonrespondents are the same, the variance will always be 
underestimated. 

However, in some cases the variances associated with respondents and nonrespondents may not be
equal. For example, consider the case of income reporting, where nonrespondents are likely to be
concentrated at the upper and lower ends of the distribution, leaving the respondents more
clustered in the middle. This results in a larger variance for the nonrespondents than for the
respondents. Thus the difference between the two variances will be negative.
Continuing with the earlier example, the bias for an estimate of the variance becomes:

$$B(s_r^2) = .30(-j) - [.21(z)] = -.30(j) - .21(z)$$

where j is the magnitude of the difference between the two variance estimates. Again, the variance
is underestimated. In fact, j is likely to be smaller than z, since variances decrease as the
sample size increases. However, the differences in the means are not affected by sample size and,
as a result, are likely to be larger in large-scale surveys. Thus, more of the bias is due to the
differences in the means and the variance will always be underestimated.

The bias for an estimate of covariance is:

$$B(s_{xy,r'}) = \left(\frac{n_{m'}}{n_t}\right)\left(s_{xy,r'} - s_{xy,m'}\right) - \left(\frac{n_{r'}}{n_t}\right)\left(\frac{n_{m'}}{n_t}\right)\left(\bar{x}_r - \bar{x}_m\right)\left(\bar{y}_r - \bar{y}_m\right)$$

Consider the case where respondents are defined as those who answered both y and a second
variable x. Here $n_{r'}$ is the number of cases with answers to both x and y, with the prime used to
indicate the joint response. If $s_{xy,r'} = s_{xy,m'}$, the covariance is not necessarily
underestimated. When the estimates of covariance are equal for respondents and nonrespondents,
the sign of the bias depends on the signs of $(\bar{x}_r - \bar{x}_m)$ and $(\bar{y}_r - \bar{y}_m)$:
if both are positive or both are negative, the bias will be negative and the covariance will be
underestimated. On the other hand, if these two terms have opposite signs, the bias will be
positive and the covariance will be overestimated.

The Problem With Ignoring Item Nonresponse 

Item nonresponse cannot be ignored because, once it exists, any analysis of the data item
requires either an implicit or explicit imputation. Ignoring the missing data and restricting
analyses to those records with reported values for the variables in the analysis implicitly invokes
the assumption that the missing cases are a random subsample of the full sample, that is, that
they are missing completely at random (MCAR). This means that missingness is not related to the
variables under study. This requires that all respondents are equally likely to respond to the
item and that the estimate is approximately unbiased. These are strong assumptions. As noted by
Brick and Kalton (1996), “The use of imputation can improve on this strategy.”

Little and Rubin included a discussion of “Quick Methods for Multivariate Data with Missing 
Data” in their 1987 book Statistical Analysis with Missing Data. In introducing these methods 
they state “Although the methods appear in statistical computing software and are widely used, we 
do not generally recommend any of them except in special cases where the amount of missing data 
is limited.” Included in this discussion are complete-case analyses where only the cases with all 
variables specified in the analysis are included (i.e., the number of cases is fixed for all variables in 
an analysis) and available-case methods that include all cases where the variable of interest is 
present (i.e., the sample base changes from variable to variable). They conclude this discussion by 
stating “Neither method, however, is generally satisfactory.” 

Lessler and Kalsbeek also explored a variety of imputation methods in their 1992 book, 
Nonsampling Errors in Surveys. While they caution that there is no substitute for complete 
response, “. . .it is better when attempting to reduce nonresponse bias to use a well-chosen method 
than to do nothing at all, unless the rate of nonresponse is low.” 

Examples 

A few numerical studies can help illustrate this point. Lessler and Kalsbeek (1992) reported on a 
1978 analysis that they conducted on data from the National Assessment of Educational Progress 
(NAEP). Their goal was to measure the effect of nonresponse on 17-year-old students, since they 
have lower response rates than the 13- or 9-year-old students. Their comparison of data from a 
subsample of nonresponding 17-year-olds with data from the original group of sample respondents 
showed that the size of the nonresponse bias relative to the variance component of most estimates 
in this survey was high. They noted that since bias does not depend on sample size, but variance 
diminishes as the sample size increases, nonresponse bias tends to be significant for large surveys. 
They also observed a direct relationship between the extent of nonresponse bias and a lowering of 
the actual confidence levels. 

A second example may be drawn from “A study of selected nonsampling errors in the 1991 Recent 
College Graduates Study” (Brick 1994). The estimate of interest is the percent of graduates with a
bachelor’s degree who are education majors. Although technically the institution is the first stage
of sample selection and the graduate is the second stage, for the purposes of this example the 
institution will be taken as the respondent and the item nonresponse is determined by whether the 
graduate responded or not. The institution response rate of 95 percent is posited to allow for a 
relatively accurate estimate of the item nonresponse bias. 

The nonresponse rate for graduates was 16.4 percent. The institutions reported data showing that
7.79 percent of the nonrespondents majored in education, compared to 10.54 percent of the
respondents. The bias can be estimated as:

[.164 × (.1054 − .0779)] = .00451 ≈ 0.5%

In other words, if the estimate were based only on the respondents, it would overestimate the
percentage who are education majors by about one-half of a percentage point.

The relative bias with respect to the estimate is:

(.00451/.1054) = .0428 = 4.3%

Thus, the bias is relatively small in this case. However, when the bias ratio is considered, a
different picture emerges. In general, a bias ratio of 10 percent or less has little effect on
confidence intervals or tests of significance. That is to say, with a bias ratio of 10 percent, the
probability of an error of more than 1.96 standard deviations from the mean is only 5.11 percent,
compared with the usual 5 percent (table 1). In the graduate example, when the estimate of bias is
compared to the standard error, the bias ratio is:

(.00451/.003047) = 1.48 = 148%

The bias ratio of 148 percent means that there is a 32 percent chance of a Type I error (i.e., 
rejecting a true hypothesis) in computing the confidence interval or conducting a significance test 
in this example. 
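
The three measures in this example, and the link between the bias ratio and the Type I error
rate, can be reproduced in a few lines. This is a sketch, not part of the original analysis: it
assumes the estimate is normally distributed, and it takes the standard error as .003047, the
value consistent with the stated bias ratio of 148 percent. The function names are illustrative.

```python
from math import erf, sqrt


def normal_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def type1_error(bias_ratio, z=1.96):
    """Chance that a nominal 95 percent interval misses the true value when
    the estimate is biased by `bias_ratio` standard errors."""
    return normal_cdf(-z - bias_ratio) + 1.0 - normal_cdf(z - bias_ratio)


nonresponse_rate = 0.164
p_resp, p_nonresp = 0.1054, 0.0779
std_error = 0.003047  # assumption: chosen to match the stated 148 percent bias ratio

bias = nonresponse_rate * (p_resp - p_nonresp)   # about .00451
relative_bias = bias / p_resp                    # about .0428
bias_ratio = bias / std_error                    # about 1.48
miss_probability = type1_error(bias_ratio)       # roughly .32
```

Under the same normal assumption, `type1_error(0.10)` and `type1_error(1.0)` come out close to
the tabulated .0511 and .1700; small differences in the fourth decimal place reflect rounding in
the published table.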

This bias ratio is so large because the estimated standard error is small, as is typically the case with 
large sample sizes. Thus, although the actual bias and the relative bias are relatively small, the bias 
ratio illustrates the fact that the impact on statistical inferences can still be quite large. This has 
important implications for federal statistical agencies that conduct large sample surveys. 

If we assume that the variance associated with the estimate of education majors is the same for 
respondents and nonrespondents, then the bias of the variance estimate in this example is: 

$$B(s_r^2) = -[(.164)(.836)](.1054 - .0779)^2 = -.000104$$

The variance in this example is underestimated by .01 percent.

Table 1. Bias ratio by size of probability of a Type I error

Bias ratio (percent)    Probability of Type I error
    2                   .0500
    4                   .0502
    6                   .0504
    8                   .0508
   10                   .0511
   20                   .0546
   40                   .0685
   60                   .0921
   80                   .1259
  100                   .1700
  150                   .3231

SOURCE: Cochran, W.G. (1977). Sampling Techniques. New York: Wiley.

Explicit Methods of Imputing for Item Nonresponse 

The alternative to ignoring missing item responses is to adopt a strategy to “fill-in,” or in other 
words, impute the missing responses. A number of different methods have been proposed and 
used in survey research. Before discussing the specific methods and the relative advantages and 
disadvantages of each one, it is worthwhile to consider the pros and cons of explicit imputations in 
general. 

Most authors in this area caution that imputations carry both potentially positive and negative 
outcomes. For example, Kalton and Kasprzyk (1982) identified three positive aspects of explicit 
imputations. They are intended to reduce biases from item nonresponse in sample survey data. By 
filling in the holes, they allow analyses to proceed as though the data set were complete, thus 
making analysis easier to conduct and results easier to report. They result in consistent results 
across analyses, because all analysts should be working with the same set of “complete” cases. 
They also identified potential drawbacks. They cautioned that imputation methods do not 
necessarily lead to a reduction in bias, relative to the incomplete data set. And, they warned against 
the danger of analysts treating the “complete” cases as actual responses, thus overstating the 
precision of the survey estimates. Brick and Kalton (1996) concur with these statements and add 
that imputation methods may also distort the association between variables. They note that 
although methods can be selected to maintain the associations of the variable subject to imputation 
with certain key variables, associations with other variables may be attenuated. 

Imputations can be categorized along two dimensions. The first is whether they are deterministic
or stochastic. In the case of deterministic imputations, the residual term is set to zero. This
yields the best prediction of the missing value; however, it results in an attenuation of the
variance of the imputed estimate relative to that of the unobserved estimate, and it distorts the
distribution of the values of the item in question. Thus, deterministic imputations give more
precise estimates of means (e.g., an average score), but produce biased estimates of distributions
(e.g., the percent of students scoring above a certain point). In stochastic imputations, the
residual or error term is randomly assigned. This addition of random noise improves the shape
parameters by yielding more realistic distributions. Brick and Kalton (1996) concluded that given
“the importance of shape parameters in many analyses, stochastic imputations are generally
preferred.”
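
The contrast between the two kinds of imputation can be illustrated with the simplest case,
imputing the overall respondent mean. The sketch below is illustrative only; the function
`impute_mean`, the choice to draw residuals from the respondents' own residuals, and the scores
are assumptions made for the example.

```python
import random
from statistics import mean, pvariance


def impute_mean(values, stochastic=False, rng=None):
    """Fill None entries with the respondent mean; if `stochastic`, add a
    residual drawn at random from the respondents' own residuals."""
    reported = [v for v in values if v is not None]
    center = mean(reported)
    residuals = [v - center for v in reported]
    rng = rng or random.Random(0)
    filled = []
    for v in values:
        if v is not None:
            filled.append(v)                               # reported value kept as is
        elif stochastic:
            filled.append(center + rng.choice(residuals))  # prediction plus random residual
        else:
            filled.append(center)                          # residual set to zero
    return filled


# Hypothetical scores with 30 percent item nonresponse.
scores = [12.0, None, 15.0, 9.0, None, 20.0, 13.0, None, 11.0, 18.0]
deterministic = impute_mean(scores)
stochastic = impute_mean(scores, stochastic=True, rng=random.Random(42))
```

With the deterministic fill, the variance of the completed data is the respondent variance
scaled down by the response rate (here 70 percent), while the mean is unchanged. The stochastic
fill restores spread at the cost of some added noise in the mean.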

The second dimension has to do with whether or not auxiliary variables are used in the imputation
method. Within the set of imputation methods that use auxiliary variables, the auxiliary variables
may be either categorical, sorting sample members into imputation classes, or continuous, as in
the case of regression imputation methods.

As mentioned earlier, a number of different types of imputation methods have been developed and 
used in survey research. For example, a partial listing includes historical imputation, deductive 
imputations, mean imputations, random imputation, overall mean imputations within classes, 
random imputation within classes, hot-deck imputation, cold-deck imputation, flexible matching 
imputation, ratio imputation, predicted regression imputation, random or stochastic regression 
imputation, EM algorithm imputation, distance function matching, composite methods, Bayesian 
Bootstrap imputation, and multiple imputation methods. There are a number of sources that 
review the methods and properties of these varied imputation techniques (Little and Rubin 1987; 
Kalton 1983; Kalton and Kasprzyk 1982, 1986; Lessler and Kalsbeek 1992; Hu, Salvucci, and 
Cohen 2000). 

The rest of this discussion will focus on those methods that are either currently used at NCES or 
the most promising alternatives for future work. 

Table 2, taken from Hu et al. (forthcoming 2004), shows the imputation methods used in recent 
NCES data collections. In the case of the universe data collections (CCD, PSS, IPEDS) the 
imputation methods most used include ratio imputation, mean imputation, and cold-deck 
imputation. In a few cases deductive or logical imputations are employed, and hot-deck 
imputation methods are also used in a few cases. Historical imputations should be added to this 
list, inasmuch as they are used in the Digest of Education Statistics and perhaps in the Condition of 
Education. 

The sample survey data collections primarily use sequential hot-deck imputation along with 
deductive imputations. There has also been limited use of within-class random imputation, 
regression imputation, multiple imputation, and a few of the methods listed above under universe 
data collections. 

Deductive or logical imputations 

Sometimes the value of a missing item can be logically deduced with certainty from responses to 
other items. It is unclear whether this should be considered a form of imputation or a form of data 
editing. If strict rules of logic are followed, then the value is clear and has no impact on any of the 
resulting statistics. While deductive imputation is the ideal form of imputation, it is frequently not 
possible. Some argue that these data corrections are best treated as edits. 


Table 2. Imputation methods employed in NCES data collections

Survey        Imputation methods
CCD           Ratio imputation and adjustment
PSS           Ratio adjustment, deductive and sequential hot-deck imputation
IPEDS-IC      Mean and ratio imputation
IPEDS-EF      Mean and ratio imputation
IPEDS-C       Mean, ratio, and cold-deck imputation
IPEDS-SA      Within class mean and ratio imputations
IPEDS-F       Ratio adjusted cold-deck and sequential hot-deck imputation
IPEDS-S       Ratio adjusted cold-deck and hot-deck imputation
IPEDS-L       Logical imputation, ratio adjustment
IPEDS-ALS     Ratio and cold-deck imputation
NSOPF         Sequential hot-deck and within-class random imputation
SASS          Deductive and sequential hot-deck imputation
SASS-TFS      Deductive and sequential hot-deck imputation
RCG           Deductive, hot-deck, and within-class random imputation
NHES          Manual and hot-deck imputation
NPSAS         Deductive, hot-deck, and regression imputation
FRSS          Mean, median, and sequential hot-deck imputation
PEQIS         Ratio adjustment and sequential hot-deck imputation
NAEP          Multiple imputation based on Bayesian models for scores
TIMSS         Multiple imputation based on Bayesian models for scores


Historical imputations 

Historical imputations are used for variables that tend to be stable over time (e.g., the number of 
teachers in a state). This method uses previously reported data from the same unit to impute for
missing data in a current data collection. This method attenuates both the size of trends and the
incidence of change. A variation on this method helps correct for these problems, by using some 
measure of trend, frequently derived from other cases. 

This method works best when the relationship over time is stronger than the relationship between 
variables at one point in time. 
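
The trend-adjusted variation can be sketched as follows. The trend measure used here, the ratio
of current to prior totals among units that reported in both periods, is one plausible choice
and an assumption of this example, as are the function name and the data.

```python
def historical_impute(prior, current):
    """Fill missing current values with the unit's prior value times a trend
    factor estimated from units that reported in both periods.

    `prior` and `current` map unit -> value; None marks a missing value.
    """
    both = [u for u in current
            if current[u] is not None and prior.get(u) is not None]
    trend = sum(current[u] for u in both) / sum(prior[u] for u in both)

    filled = {}
    for unit, value in current.items():
        if value is None and prior.get(unit) is not None:
            filled[unit] = prior[unit] * trend   # carry forward, adjusted for trend
        else:
            filled[unit] = value
    return filled, trend


# Hypothetical counts for three units; unit B did not report this year.
prior = {"A": 100.0, "B": 200.0, "C": 300.0}
current = {"A": 110.0, "B": None, "C": 330.0}
filled, trend = historical_impute(prior, current)
```

Setting the trend factor to 1 gives the plain historical imputation described above, which
carries the prior value forward unchanged.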

Cold-deck imputation 

Cold-deck imputation uses a constant value from a source external to the current data collection to 
“fill-in” the missing item. Frequently a previous iteration of the same survey serves as the external 
source. Little and Rubin (1987) acknowledge that current practice is to ignore these imputations, 
treating these data as a complete sample. They go on to state that there is no satisfactory theory for 
the analysis of data obtained by cold-deck imputation. Lessler and Kalsbeek (1992) describe
cold-deck imputation as being of historical interest, but rarely used in practice. This method
seems to be very close to historical imputations.

Mean value imputation 

Mean value imputation uses the mean of the reported values to “fill-in” the missing value. In the 
case of overall mean value imputation, the mean is taken from the entire distribution, while in 
within-class mean value imputation the mean is taken from the specific imputation class. (Median 
value imputation is very similar, using the median of the reported values.)

This method can only provide unbiased estimates for means and totals if the missing values meet 
the strong assumption of missing completely at random. Because this procedure creates a spike at 
the mean value, it does not preserve the distribution or the multivariate relationships in the data. 
Furthermore, because the sample size is effectively reduced by nonresponse, standard variance
formulas will underestimate the true variance. Overall mean value imputation is not recommended. 
Kovar and Whitridge (1995) caution that if all else fails, within-class mean value imputations can 
be used with carefully chosen classes for means and totals, but that it does not work for other 
statistics. Hu et al. (forthcoming 2004) point out that if the missing values depend on any variables 
not included in the auxiliary variables used to form the imputation class, the means and totals will 
be biased, the distribution will be distorted, and the variances will be substantially underestimated. 
Little and Rubin (1987) make the point that the distortion of the distribution is particularly 
problematic when the tails of the distribution or the standard errors of the estimates are the focus of 
study. 

Ratio imputations 

Ratio imputations, like within-class mean value imputations, use auxiliary variables that are 
closely related to the variable to be imputed and that have data available for all or nearly all of the 
sampled units. The imputed value for case i is obtained by multiplying the ratio of the mean for the 
responding cases for the variable to be imputed to the mean of all cases for the auxiliary variable 
times the case i value for the auxiliary variable. The requirement for a highly correlated auxiliary 
variable can yield accurate imputations, but it is more often the case that the variable to be imputed 
is correlated to several auxiliary variables. Thus a ratio imputation that is, by definition, tied to 
one auxiliary variable is not fully efficient. In addition, if the auxiliary item is identical across 
several units used in the imputation, the related imputed items will mirror that pattern, thus 
distorting the distribution of the imputed variable.

It is important to note here that the ratio imputations used by at least some NCES data collections
do not follow this description exactly. Instead, what is done, for example, with state-level fiscal
data in CCD is to partition the responding cases, remove the value of the variable in question from
the total for each state, compute the ratio of the value for each responding state to its reduced
total, compute the average of these ratios across all responding states, and then multiply the
total for each state with missing data by the average ratio.
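
The general form of ratio imputation described at the start of this section can be sketched as
below. This sketch takes both means over the responding cases, a common convention and an
assumption here; substitute the all-case auxiliary mean to follow the wording above literally.
The function name and the data are hypothetical, and the CCD variant is not shown.

```python
def ratio_impute(y, x):
    """Impute missing y values as (mean of y / mean of x among respondents) * x.

    `y` holds None for missing items; `x` is the auxiliary variable,
    assumed to be available for every case.
    """
    pairs = [(yi, xi) for yi, xi in zip(y, x) if yi is not None]
    # The ratio of means equals the ratio of sums over the same cases.
    ratio = sum(yi for yi, _ in pairs) / sum(xi for _, xi in pairs)
    return [yi if yi is not None else ratio * xi for yi, xi in zip(y, x)]


# Hypothetical data: enrollment is the auxiliary variable, teachers is imputed.
enrollment = [100.0, 250.0, 400.0, 150.0]
teachers = [5.0, None, 20.0, 7.0]
imputed = ratio_impute(teachers, enrollment)
```

Because every imputed value is a fixed multiple of the auxiliary value, cases that share the
same auxiliary value receive identical imputations, which is the distortion noted above.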

Regression imputation 

Predicted regression imputation is very closely related to the ratio imputation approach, the
primary difference being that a set of highly correlated auxiliary variables is used to predict
missing values in the imputed variable. In this case the imputed values are only as good as the
model used to predict them. Random regression imputation follows the same procedures used in
predicted regression imputation, with the addition of a stochastic component through the residual
terms. There are several alternative assumptions that can be used to define the way these residual
terms are generated in an imputation procedure: normally distributed, chosen at random from the
respondents’ residuals, or chosen at random from the residuals of respondents who are similar on
the auxiliary variables. One drawback that is unique to regression imputations is their ability to
yield improbable results.
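
Both variants can be sketched with a single auxiliary variable and ordinary least squares. This
is a simplified illustration rather than a production method: the description above allows a
set of auxiliary variables, and the residual rule used here (a random draw from the respondents'
residuals) is only one of the listed options. All names and values are hypothetical.

```python
import random
from statistics import mean


def regression_impute(y, x, stochastic=False, rng=None):
    """Impute missing y from a least-squares line fitted to the respondents.

    With stochastic=True, a residual drawn at random from the respondents'
    residuals is added to each predicted value.
    """
    obs = [(xi, yi) for xi, yi in zip(x, y) if yi is not None]
    x_bar = mean(xi for xi, _ in obs)
    y_bar = mean(yi for _, yi in obs)
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in obs)
             / sum((xi - x_bar) ** 2 for xi, _ in obs))
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in obs]

    rng = rng or random.Random(0)
    filled = []
    for xi, yi in zip(x, y):
        if yi is None:
            yi = intercept + slope * xi          # predicted (deterministic) value
            if stochastic:
                yi += rng.choice(residuals)      # random regression imputation
        filled.append(yi)
    return filled


# Hypothetical data with one missing response.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, None, 8.0, 9.9]
predicted = regression_impute(y, x)
randomized = regression_impute(y, x, stochastic=True, rng=random.Random(1))
```

Nothing in the fitted line keeps predictions within the plausible range of the item, which is
how improbable values (for example, negative counts) can arise.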

In this case, as in other forms of imputation, the component of variance that is attributable to
survey nonresponse is not accounted for in standard variance estimation software, resulting in an 
underestimation of the true variance. 

Hot-deck imputation 

Hot deck originally got its name from the decks of computer cards that were used in processing 
data files, with the term hot referring to the same data file. There is actually a class of imputation 
procedures that share this label. The common thread is that missing values are replaced one at a 
time with an available value from a similar respondent in the same study. This general approach is 
probably the most widely used imputation method. One of the reasons there is variability among
types of hot-deck methods is that the approach’s popularity has caused it to evolve. In general,
the procedure starts with a set of imputation classes, and the cases within each class are
processed and compared. This procedure preserves the distribution of the estimates, and increases
the variance relative to the mean imputation method. Thus, the underestimation of the variance of
the estimate is decreased.

In the case of the sequential hot-deck imputation, each class starts with a single value for the item 
subject to imputation. Each record is compared to that item, and if the record has a value for that 
item, it replaces the starter value; on the other hand, if the record is missing that item, the starter 
value or the value that has replaced it is “filled-in” on the case with the missing value. One 
problem occurs with this approach when several records with missing values occur together on the 
file. This results in the current donor value being assigned to multiple records, thus leading to a 
lack of precision in the survey estimates (Kalton and Kasprzyk 1986). A variation on this approach 
is known as random imputation within classes, the difference here being that the donor respondent 
is chosen at random within the imputation class for assignment to the nonrespondent. Lessler and 
Kalsbeek (1992) pointed out that if this is done with replacement, the multiple use of a donor 
problem persists; however, they also noted that this can be avoided by sampling without
replacement. While this procedure is more cumbersome, it has the advantage of providing a basis
to correctly formulate the mean square error of estimators using a hot-deck imputation. 
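
The sequential procedure described above can be sketched directly. The record layout, the field
names, and the starter values are hypothetical; the sketch assumes records are processed in file
order and that every imputation class has a starter value.

```python
def sequential_hot_deck(records, item, class_key, starters):
    """Sequential hot-deck: within each imputation class, a reported value
    replaces the stored donor value, and a missing value is filled with it.

    `records` is a list of dicts in file order; `starters` maps each
    imputation class to its starter value.
    """
    current = dict(starters)              # stored donor value for each class
    completed = []
    for rec in records:
        rec = dict(rec)                   # leave the input records untouched
        cls = rec[class_key]
        if rec[item] is not None:
            current[cls] = rec[item]      # reported value becomes the new donor
        else:
            rec[item] = current[cls]      # missing: fill in the stored value
            rec["imputed"] = True
        completed.append(rec)
    return completed


# Hypothetical file: two consecutive public-sector records are missing salary.
records = [
    {"cls": "public", "salary": 40.0},
    {"cls": "public", "salary": None},
    {"cls": "private", "salary": None},
    {"cls": "public", "salary": None},
    {"cls": "private", "salary": 35.0},
]
starters = {"public": 38.0, "private": 30.0}
result = sequential_hot_deck(records, "salary", "cls", starters)
```

In this example the same public-sector donor value is assigned to two records, which is the
multiple-use problem noted above, and the first private-sector record receives the starter value
because no donor has yet been read in its class.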

Another way to avoid the problems associated with sequential hot-deck imputation is the 
hierarchical hot-deck imputation. This method sorts respondents and nonrespondents into a large 
number of imputation classes based on a detailed categorization of a large set of auxiliary 
variables. Nonrespondents are then matched with respondents in the smallest class first; if no 
match is found that class is collapsed with the next one, and so on until a donor is found — hence 
the label hierarchical. 
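
A minimal sketch of the hierarchical search follows. The collapsing rule used here, dropping the
least important auxiliary variable at each step, is an assumption made for illustration, as are
the variable names and the random choice among eligible donors.

```python
import random


def hierarchical_hot_deck(records, item, keys, rng=None):
    """Match each nonrespondent to a donor in the finest class available.

    `keys` lists auxiliary variables from most to least important; classes
    are collapsed by dropping trailing keys until a donor is found.
    """
    rng = rng or random.Random(0)
    donors = [r for r in records if r[item] is not None]
    completed = []
    for rec in records:
        rec = dict(rec)
        if rec[item] is None:
            for depth in range(len(keys), -1, -1):
                pool = [d for d in donors
                        if all(d[k] == rec[k] for k in keys[:depth])]
                if pool:
                    rec[item] = rng.choice(pool)[item]
                    rec["match_depth"] = depth   # number of variables matched
                    break
        completed.append(rec)
    return completed


# Hypothetical records with three auxiliary variables.
records = [
    {"region": "N", "level": "elem", "size": "small", "pay": 30.0},
    {"region": "N", "level": "sec", "size": "large", "pay": 45.0},
    {"region": "S", "level": "elem", "size": "small", "pay": 28.0},
    {"region": "N", "level": "elem", "size": "small", "pay": None},
    {"region": "S", "level": "sec", "size": "large", "pay": None},
]
result = hierarchical_hot_deck(records, "pay", ["region", "level", "size"])
```

The fourth record finds a donor in its full three-variable class, while the fifth finds one only
after the classes are collapsed to region alone.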

As problems have been identified, alternative schemas have been devised to solve those problems. 
Regardless of the specifics, all hot-deck procedures take imputed values from a respondent in the 
same data file, thus yielding imputations that are valid, although not necessarily internally 
consistent for the respondent values. In order to evaluate the hot-deck imputation used for any 
specific data collection, detailed information is required.


Data Analysis With Imputed Data 

This brief review has highlighted the fact that imputed data sets can provide good estimates of 
means and totals, and that with some care and attention in the selection of the imputation method, 
the distributions can be reasonably well preserved. However, as Kovar and Whitridge (1995) point 
out, “The situation is not as favorable when it comes to estimates of variances and correlations.” 
They note that numerous studies have shown that imputations can have a deleterious effect on the 
statistics of the estimates. In particular, correlations between imputed variables are attenuated to 
varying degrees, but good auxiliary variables can help this problem (Santos 1981; Kalton and 
Kasprzyk 1982, 1986; and Little 1986). 

When standard formulas are used for the computation of statistics for estimates based on imputed
data, the variances of estimated means and totals are underestimated (Rubin 1978). This
underestimation occurs because standard computing software treats imputed values for missing
data as observed data and thus, ignores the component of variance that is due to imputation. Kovar
and Whitridge (1995) report that standard variance formulas underestimate the variance with
imputations present by about 2 to 10 percent with a nonresponse rate of 5 percent and by as much 
as 10 to 50 percent with 30 percent nonresponse. The size of the underestimate varies with 
different types of imputation. 

Brick and Kalton (1996) discuss two methods for reducing imputation variance. The first method 
involves the use of sampling strategies. Selecting donors without replacement within each
imputation class minimizes the multiple use of donors, resulting in a lower imputation variance
compared to sampling with replacement. When there is more than one respondent in a class,
stratified sampling within a class or systematic sampling from an ordered list can also help reduce
imputation variance. The second method relies on fractional imputation. With this approach 
individual respondent records are divided into parts, with weights distributed accordingly, and 
separate donors are chosen for each part of the respondent’s record.

The underestimation of the variance results in short confidence intervals and a tendency to declare
significance when none exists. Särndal (1992) demonstrated that these statistical problems
become more severe as the amount of missing data increases. Lessler and Kalsbeek (1992) point 
out that the size of the nonresponse bias associated with totals, means, variances, and covariances 
is linked to differences between respondents and nonrespondents. 

There are several recently developed techniques designed to estimate the variance due to 
imputation. Rubin pioneered the use of multiple imputations in this arena, estimating the variance
by replicating the process a number of times and then estimating the between-replicate variance.
Särndal (1992) proposed a method using model-assisted estimators of variance. Rao and Shao
(1992) use a method that corrects the usual jackknife variance estimator. Brick, Kalton, Kim and 
Fuller are currently under contract to NCES, conducting an evaluation of these new 
methodologies. The Statistical Standards Program at NCES is also supporting work by Aitken on 
an alternative approach using the EM algorithm. 
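
The multiple imputation idea can be made concrete with the combining rules usually attributed to
Rubin: the total variance is the average within-imputation variance plus the between-imputation
variance inflated by (1 + 1/m). The sketch below assumes m completed data sets have already been
analyzed; the function name and the numbers are hypothetical.

```python
from statistics import mean, variance


def combine_multiple_imputations(estimates, variances):
    """Combine m point estimates and their variances from m imputed data sets."""
    m = len(estimates)
    point = mean(estimates)             # combined point estimate
    within = mean(variances)            # average within-imputation variance
    between = variance(estimates)       # between-imputation variance (divisor m - 1)
    total = within + (1 + 1 / m) * between
    return point, total


# Hypothetical results from five imputed data sets.
estimates = [10.2, 9.8, 10.5, 10.1, 9.9]
variances = [0.40, 0.42, 0.38, 0.41, 0.39]
point, total = combine_multiple_imputations(estimates, variances)
```

In this example the total variance (about 0.49) exceeds the average within-imputation variance
(0.40); the difference is the component that standard formulas applied to a single imputed data
set leave out.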

Despite these limitations and cautions associated with various imputation methods, Little and 
Rubin (1987) note that “It is important to emphasize that in many applications the issue of 
nonresponse bias is often more crucial than that of variance. In fact, it has been argued that providing a 
valid estimate of sampling variance is worse than providing no estimate if the estimator has a large 
bias...” 


Comparisons of Alternative Imputation Methods 

There are a number of extant studies comparing alternative imputation methods. Two of them were 
conducted using NCES data, and a third involving a set of simulations was supported by NCES. 

IEA Reading Literacy Study 

One example using NCES data from the U.S. component of the IEA Reading Literacy Study 
compared complete case (CC) analysis, available case (AC) analysis, hot-deck (HD) imputation, 
and the EM algorithm (EM) (Winglee et al. 1994). The first three methods were described above. 
The EM algorithm uses an iterative maximum likelihood procedure to provide estimates of the 
mean and variance-covariance matrix based on all available data for each respondent. The 
algorithm assumes the data are from a multivariate normal distribution and that, conditional on the 
reported data, the missing data are missing at random. To conduct this comparison, regression 
equations were estimated using the four methods of imputation. 
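
The difference between complete case and available case analysis can be shown with a small sketch. It assumes the usual definitions: complete case analysis keeps only records with no missing values on any analysis variable, while available case analysis uses every observed value for each variable separately. The records and variable names are hypothetical.

```python
def complete_case_means(records, variables):
    """Means computed only from records with no missing analysis variables."""
    complete = [r for r in records if all(r[v] is not None for v in variables)]
    means = {v: sum(r[v] for r in complete) / len(complete) for v in variables}
    return means, len(complete)


def available_case_means(records, variables):
    """Means computed from all observed values of each variable separately."""
    means = {}
    for v in variables:
        observed = [r[v] for r in records if r[v] is not None]
        means[v] = sum(observed) / len(observed)
    return means


# Hypothetical student records; None marks a missing item.
records = [
    {"score": 500, "age": 9},
    {"score": 520, "age": None},
    {"score": None, "age": 10},
    {"score": 480, "age": 9},
]
cc_means, n_complete = complete_case_means(records, ["score", "age"])
ac_means = available_case_means(records, ["score", "age"])
print(n_complete, cc_means["score"], ac_means["score"])
```

In this toy example half of the records are dropped by the complete case approach, and the two approaches give different mean scores.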

A linear regression model was used to predict a student’s performance on a reading literacy test. 
The three reading scores used as the dependent variables were the narrative, expository, and 
document performance scores. These scores were derived using Item Response Theory models 
scaled for international comparison (Elley 1992). The predictor variables used in all models were 
gender, age, race, father’s and mother’s education, family structure, family composition, family 
wealth/possessions, and use of a language other than English at home. The amount of missing data 
ranged from 0 to 18 percent with 31 percent missing data for one or more variables. 


Unweighted ordinary least squares regressions were run using each of the four methods 
for the three dependent variables. For each dependent variable, the regression coefficients 
estimated using the HD, EM, and AC methods were very similar. The estimates using the CC 
analysis method were dissimilar. This analysis also used adjusted mean scores to examine the 
performance of subgroups of students after controlling for other characteristics. The adjusted 
scores for a number of subgroups (e.g., gender, minority status, and parent’s education) showed 
mean scores using CC that were approximately 10 points higher than the mean scores using HD, 
EM, and AC. These differences are presumably explained by the fact that the CC analysis excludes 
the 31 percent of the students who had missing data on one or more items. 

This analysis was repeated for a comparison of CC, AC, and HD using weighted data. Although 
the use of the weights reduced the size of the gap somewhat, the differences persisted, with the CC 
analysis method yielding higher estimates than the AC and HD methods (which yielded similar 
results). The authors of this report concluded that the CC analysis method was clearly inefficient. 
Rather than the missing cases being randomly distributed, they found evidence that the students 
with missing data differed from those with complete data in reading performance, race/ethnicity, 
type of community, region of the country, and control of the school. They further concluded that, 
because the remaining three methods (AC, HD, and EM) gave similar results and the HD method 
is the easiest to implement, HD is the best method to use for the IEA study. 

NELS:88 

The second example from the analysis of NCES data uses data from the National Education 
Longitudinal Study of 1988 (NELS:88) to compare two imputation methods that were used for 
test scores — within-class random hot-deck imputation and model-based random imputation 
(Bokossa, Huang, and Cohen 2000). The goal of this study was to select an imputation method to 
use to impute missing reading and math scores in the base year to second follow-up cohort. Sixty-five 
percent of the cohort took all four cognitive assessments in the three waves of the survey. The 
nonresponse rates by key demographic subgroups ranged from 20.5 to 27.5 percent, with the 
highest rates among minority students and low socioeconomic status (SES) students, causing some 
concern over potential bias in the NELS estimates of academic performance. 

The authors of this analysis first identified a set of auxiliary variables. Then, using the subset of 
cases with complete data, they simulated different levels and patterns of missingness assuming 
about 20 percent missing data. Following the simulation, the incomplete data were compared with 
the imputed data using the average imputation error, the bias of the variance, and the bias of the mean. The 
average imputation error was found to be consistently lower in the model-based approach 
compared to the hot-deck approach. 
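
The general shape of such a simulation can be sketched as follows. This is a minimal illustration, not the procedure used by the authors: it starts from complete synthetic scores, deletes about 20 percent of them completely at random, fills the gaps with a within-class random hot-deck (donors drawn with replacement, for simplicity), and then measures the result. The average imputation error is assumed here to be the mean absolute difference between imputed and true values, and the class labels and score distributions are hypothetical.

```python
import random


def hot_deck_within_class(values, classes, rng):
    """Replace None with a randomly drawn observed value from the same class."""
    donors = {}
    for value, cls in zip(values, classes):
        if value is not None:
            donors.setdefault(cls, []).append(value)
    imputed = []
    for value, cls in zip(values, classes):
        if value is None:
            if cls not in donors:
                raise ValueError(f"no donor available in class {cls!r}")
            value = rng.choice(donors[cls])  # donor drawn with replacement
        imputed.append(value)
    return imputed


def average_imputation_error(true_values, masked, imputed):
    """Mean absolute error over the cases whose values were deleted."""
    errors = [abs(i - t) for t, m, i in zip(true_values, masked, imputed) if m is None]
    if not errors:
        raise ValueError("no values were deleted")
    return sum(errors) / len(errors)


rng = random.Random(2002)
classes = [rng.choice(["low SES", "high SES"]) for _ in range(500)]
true_scores = [rng.gauss(45 if c == "low SES" else 55, 8) for c in classes]
# Delete about 20 percent of the scores completely at random.
masked = [None if rng.random() < 0.20 else s for s in true_scores]
imputed = hot_deck_within_class(masked, classes, rng)

aie = average_imputation_error(true_scores, masked, imputed)
mean_bias = sum(imputed) / len(imputed) - sum(true_scores) / len(true_scores)
print(round(aie, 2), round(mean_bias, 3))
```

A model-based method could be evaluated by swapping in a different imputation function and comparing the same summary measures.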

Looking first at math, although a comparison of the bias of the mean across the two imputation 
methods and the incomplete data showed no consistent pattern, the means computed with the 
incomplete data were outperformed by one or both of the two imputation methods in all but 
one comparison (i.e., the bias was smaller for at least one of the imputation methods). The relative bias of 
the variance was consistently smaller in the model-based approach than it was in the other two 
approaches. The same results were observed in reading. 

The authors concluded that the model-based approach was the “preferred method” and proceeded 
to use PROC IMPUTE to implement the imputations for the NELS data set. 

Simulation Study 

In an NCES-sponsored simulation study, Hu, Salvucci, and Cohen (2000) used 6 evaluation 
criteria to compare 11 imputation methods for 4 types of distributions, 5 types of missing-data 
mechanisms, and 4 missing-data rates. The imputation methods evaluated include mean 
imputation, ratio imputation, sequential nearest neighbor hot-deck imputation, overall random 
imputation, mean imputation with disturbance, ratio imputation with disturbance, approximate 
Bayesian bootstrap, Bayesian bootstrap, modeling non-ignorable missing mechanism (PROC 
IMPUTE), data augmentation (Schafer’s software), and adjusted data augmentation method. 

The evaluation criteria used include bias of parameter estimates, bias of variance estimates, 
coverage probability, confidence interval width, and average imputation error. 
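
Several of these criteria can be computed directly from repeated simulated data sets. The sketch below is a generic illustration under stated assumptions, not the authors' code: it treats each simulated sample as a completed data set, forms a normal-theory 95 percent confidence interval for the mean, and summarizes bias, coverage, and interval width. The function name, the population values, and the number of replicates are hypothetical.

```python
import random
from statistics import mean, stdev


def evaluate(true_mean, samples, z=1.96):
    """Summarize bias, coverage, and confidence interval width over replicates."""
    estimates, covered, widths = [], [], []
    for sample in samples:
        estimate = mean(sample)
        half_width = z * stdev(sample) / len(sample) ** 0.5
        estimates.append(estimate)
        covered.append(estimate - half_width <= true_mean <= estimate + half_width)
        widths.append(2 * half_width)
    return {
        "bias": mean(estimates) - true_mean,       # bias of the parameter estimate
        "coverage": sum(covered) / len(covered),   # share of intervals covering the truth
        "ci_width": mean(widths),                  # average confidence interval width
    }


rng = random.Random(11)
samples = [[rng.gauss(50, 10) for _ in range(100)] for _ in range(200)]
summary = evaluate(50, samples)
print(summary)
```

An imputation method that understates the variance would show up here as narrow intervals with coverage well below the nominal 95 percent.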

They found that the results varied across different types of missing data. The five types considered 
are missing completely at random (MCAR), tails more likely missing, large values more likely 
missing, center values more likely missing, and tail values more likely missing with confounding 
(missingness in y depends on y itself). 

In the case where large values are missing, ratio imputation (with or without disturbances) and data 
augmentation (Schafer 1997) correct the bias in the mean; and within class random imputation and 
the sequential nearest neighbor hot-deck improved the biases substantially. However, the authors 
cautioned that the findings for ratio imputation may well be an artifact of their manipulation of the 
data. In summary, they note that although the improvement is much less when there is a right 
skewed distribution, in most cases these methods provide improvement when considerable biases 
exist in the means with the incomplete data. 

In summarizing the results for variance estimation, the authors concluded that all imputation 
methods studied, except the mean imputation method, yield acceptable variance estimates when the 
data are missing completely at random. For the three unconfounded types of missing data — tails 
missing, large values missing, and center missing — data augmentation (Schafer 1997) worked best, 
but ratio imputation, within class random imputation, and the sequential nearest neighbor hot-deck 
method all can improve the biases of variance estimates dramatically. (However, there is a caution 
that the ratio imputation method tends to overestimate the variance.) For the confounded missing 
data pattern, where the missingness is related to the variable itself, only the ratio imputation 
method (with and without disturbances) results in a substantial improvement in the bias of the 
variance. 

When coverage rates and confidence interval widths are considered together, data augmentation 
(Schafer 1997) and adjusted data augmentation are the least likely to provide bad estimates. 

Finally, when average imputation error is considered, ratio imputation, data augmentation (Schafer 
1997), and within class random imputation perform best, followed by hot-deck, ratio with 
disturbance, and mean imputation methods. 

Looking across the entire set of results, data augmentation (Schafer 1997) is the one imputation 
method that scores high on all accounts. Two other methods that are more commonly used at 
NCES — within class random imputation (PROC IMPUTE) and the sequential nearest neighbor 
hot-deck method — also performed well in estimating means and variances and perform reasonably 
well on coverage rates and average imputation error (although within class random imputation 
(PROC IMPUTE) usually edges out the hot-deck method). 


SAMPLE TABLE 


Table 6. Number of public high school completers, by state:             TITLE
         School year 1999-2000

                             High school completers                     BOXHEAD:
            --------------------------------------------------------    SPANNER HEAD
                                            Other high   High school    COLUMN HEADER
                               Diploma          school   equivalency
State                Total  recipients      completers  recipients 1
--------------------------------------------------------------------
U.S.                     —   2,546,102        41,638 2             —    TABLE BODY
Alabama             43,459      37,819           2,535         3,105    (Universe data)
Alaska               7,968       6,615              53         1,300
Arizona                         38,304             375             †
Wyoming                  —       6,462              27             —
--------------------------------------------------------------------

— Not available.                                                        SPECIAL NOTES
† Not applicable, no equivalency program.
1 Includes recipients age 19 or younger, except in Minnesota, where
they are age 20 or younger.
2 Total other high school completers does not include New Hampshire,    REFERENCE NOTES
New Jersey, Washington, and Wisconsin.
NOTE: High school completer categories may include students not         GENERAL NOTE
included in 12th-grade membership.
SOURCE: U.S. Department of Education, National Center for Education     SOURCE NOTE
Statistics, Common Core of Data (CCD), "State Nonfiscal Survey of Public
Elementary/Secondary Education," 2000-01.


INTRODUCTION 


Tabular presentation is a way to bring together and present related material in 
columns or rows. The object is to show in a concise and orderly manner information 
that could not be shown so clearly in any other way. To many users and potential 
users, however, columns and rows of figures are not easy to understand. Important 
facts and figures may be buried in the masses of data shown. To enable the 
inexperienced user to accurately interpret the data, and the experienced statistician to 
do so more readily, table design should be kept as simple and direct as the subject 
matter and available space allow. In general, good design is as simple as possible, 
focuses attention on the data, and makes their meaning and significance clear. Poor 
design obscures the meaning and distracts attention. 

A consistent “style” of presentation can help avoid distracting the user’s attention. 
Subtle differences in terminology may cause the perceptive reader to wonder whether a 
difference in meaning is involved. So, one of the general standards of good 
presentation is to use the same terminology in title, stub, headings, footnotes, etc. 

To that end, these guidelines stress the importance of table design to satisfy the needs 
of the user, not of the producer. A consistent style builds a “normal expectation” 
through uniform treatment of many details. Unaccountable variation may distract the 
user and weaken the user’s understanding of the content of the table. And by 
avoiding meaningless “differences,” the table producer can capitalize on meaningful 
differences, and strengthen understanding, when deliberate small changes are made in 
words, phrases, or table structure. 

The guidelines developed here attempt to adapt some widely accepted principles of 
tabular presentation to the subject matter, production methods, and operating 
procedures dealt with in NCES. Further, as with any set of guidelines, some arbitrary 
choice among acceptable alternatives is involved here. The guidelines are intended to 
help the development of clear and concise tabular presentations tailored to NCES 
needs. 


Much of the material in the 1972 NCES Guidelines for Tabular Presentation was adapted 
from the Census Manual. The Government Printing Office Style Manual and the Manual of 
Statistical Presentation (January 1970), prepared by the Division of Research Grants, 
National Institutes of Health, were also consulted for appropriate details. This 2002 edition 
draws heavily on the 1972 edition. Some of the revisions reflect technological changes. The 
Publication Manual of the American Psychological Association was also consulted for 
current practices. Beyond that, the modifications that have been made represent the 
experiences of a number of NCES analysts and contractors. 


MAJOR TYPES OF TABLES 


There are three types of tables that are used in NCES publications. Taken in order of 
complexity, they are summary tables, reference tables, and methodological tables. 

Summary/Text Tables 

Summary, or text, tables focus on selected data to show important comparisons and 
relationships. In reports containing analytical text, these tables are often placed at or 
near the first textual reference to them because they are closely related to the 
discussion. If numerous, they may be grouped at the ends of chapters or at the end of 
the report, preceding the reference tables, if there are any. 

Reference Tables 

Reference tables are more detailed tables. Large quantities of information and 
comprehensive collections of data appear in reference tables. They normally form a 
separate section usually placed following the text at the end of the report. 

Sometimes, fairly short reference tables appear at the ends of chapters if summary 
tables are interspersed in the text. 

Methodological Tables 

Methodological tables contain standard errors or confidence intervals for data in a 
report. Place these tables in an appendix. 


TABULAR FORMAT 


Printed Position 

Tables may be printed on the page in either portrait or landscape position in a variety 
of structural forms. In portrait tables, the words and data extend across the printed 
page (normal “width”), as these sentences do. Most tables present statistical data in 
this format. Landscape tables are rotated a quarter turn to the left, with the words and 
data extending up the page — the top of the table at the left, the bottom at the right. 
Landscape tables should be avoided if possible (particularly when interspersed in a 
report with the text and other tables in an upright position) because smooth transition 
is interrupted from text to table and from table to table. 

Single-Page Tables 

Occupying one page or less, these tables are easy to examine and highly desirable, 
especially as summary tables. If well designed, they convey easily grasped amounts 
of information as complete units. Frequently, careful pruning will allow a table that 
is either a little too long, a little too wide, or both to fit on a single page. 

Multi-Page Tables 

Although single-page tables are preferred, there are times when a table is too long to 
fit on one page; if these tables cannot logically be split into smaller tables, they must 
be continued on one or more additional pages. The title (with “ — Continued”) and 
the boxhead are repeated on successive pages of multi-page tables. The end of each 
page preceding the last page of a multi-page table should carry a note advising the 
reader to “See notes at end of table.” The notes for a multi-page table appear on the 
last page of a multi-page table. 

Double-Page-Spread Tables 

The double-page spread is a special kind of portrait multi-page table that extends 
across facing pages, instead of one page, with about half of the column headings on 
each page. It may continue on successive facing pages. The entire stub should be 
repeated at the right side of the right-hand page; but if there is not enough room, line 
numbers may be used instead. (See Line Numbers.) The title is repeated on the 
second and subsequent pairs of pages (with “ — Continued”). Otherwise, the double-page 
spread is treated much like a one-page-size portrait table with the advantage of 
accommodating about twice as many columns. 

Hybrid Tables 

Two types of portrait tables that combine some of the aspects of both page-wide and 
double-page-spread tables are the “divide” and the “double-up” tables. 

Divide tables are portrait multi-page tables in which the title is repeated (with 
“ — Continued”), the stub is repeated on the left of each page, and the column heads 
continue across a second page or more. If only two pages wide, a divide table may be set up on 
facing pages, like a double-page spread, and the stub may continue for any number of 
pages. The divide table is useful if the stub is only one page long but the table must be 
three or more pages wide. It obviously cannot be both too long for one page and too wide for 
two. 

First page:               Second page:                   Third page: 

Table 1. Title            Table 1. Title — Continued     Table 1. Title — Continued 

Stub-     Column          Stub-     Column               Stub-     Column 
head      heads           head      heads                head      heads 


A double-up table is set up somewhat like a double-page-spread table confined to 
one-page width. It is especially useful for a long table with few columns. It may 
continue as a multi-page table. The title occupies the width of the page, but the 
stubhead and column heads are repeated under it in the two halves, as shown. 


Table 1. Title * * * * 
         * * * 

Name of          Enrollment         Name of          Enrollment 
Institution      Men    Women       Institution      Men    Women 

Or alternatively, 

Table 1. Title * * * * 
         ******** ***** * * * ****** 

Name of          Enrollment         Name of          Enrollment 
Institution      Men    Women       Institution      Men    Women 


TABLE TITLES 


The formal tables (summary, reference, and methodological) have headings 
consisting of identification symbols (numerical or alphabetical); descriptive titles; 
and, sometimes, headnotes. 

Table Identifiers 

Tables in Executive Summaries should be lettered alphabetically, and tables in the 
body of the report should be numbered consecutively. For many reports, simple 
identifiers such as Arabic numerals in sequence — 1, 2, and so on — are the best 
solution. For example, most NCES reports have a short introductory text, with no 
chapter numbers needed; one series of tables, requiring identifiers; and one appendix, 
requiring none. However, distinguishing identifiers are needed for more than one 
series of tables (such as a few summary tables and reference tables) or more than one 
appendix. An orderly system that takes account of the table identifiers in relation to 
the other parts of a report is needed. Without one, confusion results in a 
publication that has, for example, three or more separate series of tables 
(summary, reference, and those in one or more appendixes) that must be distinguished 
from, or related to, a series of charts, the appendixes themselves, and several 
chapters. 

Readily available for identifiers are Arabic numerals and the English alphabet in 
uppercase and lowercase. Arabic numerals are easiest to comprehend and can extend 
easily through any number of table titles. In addition, the tables within a particular 
series may have subseries that need to be related. For example, a main or “master” 
table may show particular data for all postsecondary institutions in the United States, 
followed by a subseries of tables showing identical kinds of data separately for 
universities, other 4-year institutions, and 2-year institutions. They should be 
numbered with a basic identifier and an appropriate suffix that is selected to avoid 
disrupting the standard numbering system and to bring out the table relationships, as 
shown in the following example: 

Table 5. All institutions 
Table 5-A. Universities 
Table 5-B. 4-year institutions 
Table 5-C. 2-year institutions 

The same scheme might be used for a frequency table showing basic figures followed 
by a table of percents or medians derived from the basic figures (tables 5 and 5-A). 

A slight variation may be used when component parts are shown separately in a series 
of tables without a master table. These tables are basically a single whole table that is 
split apart into a series of consecutive tables for convenience. They might be 
numbered 5-A, 5-B, etc. 

Table 5-A. Publicly controlled institutions 
Table 5-B. Privately controlled institutions 

Appendixes should be lettered; and tables in appendixes should be assigned the letter 
of the appendix and a number suffix. For example, tables in appendix A should be 
labeled A-1, A-2, etc.; in appendix B, B-1, B-2, etc. If there is a methodology table 
for each summary/text table, it is helpful to use the appendix letter followed by a 
number suffix, where the number corresponds to the text table number. 
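
The identifier schemes described in this section are regular enough to generate mechanically. The following is a minimal sketch, assuming a report is assembled by a script; the function names are hypothetical and not part of any NCES tool.

```python
from string import ascii_uppercase


def subseries_identifiers(base, count, include_master=True):
    """Identifiers such as 5, 5-A, 5-B for a master table and its subseries."""
    if not 0 <= count <= len(ascii_uppercase):
        raise ValueError("count must be between 0 and 26")
    identifiers = [str(base)] if include_master else []
    identifiers += [f"{base}-{ascii_uppercase[i]}" for i in range(count)]
    return identifiers


def appendix_table_identifiers(appendix_letter, count):
    """Identifiers such as A-1, A-2 for the tables in one appendix."""
    return [f"{appendix_letter}-{n}" for n in range(1, count + 1)]


print(subseries_identifiers(5, 3))                        # master table plus three parts
print(subseries_identifiers(5, 2, include_master=False))  # component tables only
print(appendix_table_identifiers("A", 2))
```

Setting `include_master=False` corresponds to the variation in which a single table is split into consecutive parts without a master table.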


Wording of Table Titles 

Titles are catalogs of content and guides for ready reference. They should tell what, 
how classified, where, and when. For example: 


What:            Basic content and general limits of the group or subgroup that 
                 are shown in the table (e.g., enrollments in postsecondary 
                 institutions). 

How classified:  How the universe data are classified and cross-classified (e.g., 
                 by control of institution, age and sex of student, geographic 
                 region, and state). 

Where:           Area or space segment, such as political division, geographic 
                 area, or other coverage designation if necessary for clarity (e.g., 
                 by country, by states, or, perhaps, geographic regions). 

When:            Time reference (e.g., 2000; September 1999; academic year 
                 1998-99; various years, 1950-90, etc.). 


Thus, we might have: 

Table 1. Full-time-equivalent fall enrollment in postsecondary institutions, by control 
and age: By state, 1998 

Note the punctuation; a period and two “n” spaces between number and first word are 
used to separate the title from the table identifier. A comma is used before the “by” 
classification, with commas separating series of three or more components, including 
a comma before “and.” Finally, a colon is used before the “where” or “when” 
reference (use a comma between where and when if both are present). Note also that, 
besides proper nouns, the first word of the title and the first word after the colon 
begin with capital letters. 
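
The punctuation rules above can be expressed as a small title-building routine. This is a sketch only, with hypothetical function and parameter names; it uses a single ordinary space after the identifier because plain text cannot reproduce the typeset spacing.

```python
def join_series(items):
    """Join classification items, with a comma before 'and' for three or more."""
    if len(items) <= 2:
        return " and ".join(items)
    return ", ".join(items[:-1]) + ", and " + items[-1]


def format_table_title(identifier, what, classified_by=(), where=None, when=None):
    """Assemble a title as: what, how classified: where, when."""
    title = f"Table {identifier}. {what}"
    if classified_by:
        title += ", by " + join_series(list(classified_by))
    reference = ", ".join(part for part in (where, when) if part)
    if reference:
        # The first word after the colon begins with a capital letter.
        title += ": " + reference[0].upper() + reference[1:]
    return title


title = format_table_title(
    1,
    "Full-time-equivalent fall enrollment in postsecondary institutions",
    classified_by=["control", "age"],
    where="by state",
    when="1998",
)
print(title)
```

With three classification items, for example `["age", "sex", "state"]`, the same routine produces "by age, sex, and state".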

For the “how classified” segment, a definite order should be used. Start with the 
data-column heads crossing left to right and top to bottom, then the stub. For 
example, the title above would fit a table set up like this: 


              Total, in all          In publicly controlled    In privately controlled 
              institutions 
State       Under 30   Over 30       Under 30   Over 30        Under 30   Over 30 

U.S. 
Alabama 
Alaska 
Arizona 

If the purpose is to emphasize one of these elements, rewording of the title might 
better reflect the content. For example, the element that sets this table apart from 
others in a series might be the control classification. Then the title could read: 
“Enrollments in publicly and privately controlled institutions of higher education, by 
age, sex, and state: 1998” leaving out “control” in the classification segment. 

The title must never promise more than the table contains, but the table may contain 
more. To avoid excessive wordiness, generalizations may be used, but table titles 
should be detailed and explicit enough to differentiate any one table from all others in 
a report. For example, if the number of items in the classification segment is lengthy 
and a subset of items are repeated across a series of tables, the table titles might read: 

Table 1. Fall enrollment in elementary and secondary schools, by free lunch 
eligibility and selected characteristics: 1999 

Table 2. Fall enrollment in elementary and secondary schools, by minority 
enrollment and selected characteristics: 1999 

The wording should be in topical form, not in sentence form. This means that verbs 
are omitted from titles, as are articles and other parts of speech that do not convey the 
basic content. Phrases such as “numbers of,” “percent of,” and “distributions of” may 
also be omitted if the meaning and differentiation from other tables are clear without 
them. Carefully chosen headnotes and footnotes also may help shorten titles. 
Abbreviations are used sparingly, and then only those that are commonly accepted or 
otherwise identified, as in footnotes or text. 

Placement of Titles 

Start the first line of the title at the left margin and begin each subsequent line under 
the first word of the title. 

Table 1. The first line of the title extends the width of the table; the second and 
subsequent lines begin under the first word of the title 

Titles for Multi-Page Tables 

For each page after the first page of a multi-page table, repeat the table number and 
the full table title, with the word “ — Continued” added, as follows: 

Table 1. Total expenditures for public elementary and secondary education, by 
function and state: 1995-96 — Continued 

In the case of a double-page-spread table the word “ — Continued” is added after the 
first pair of facing pages. 

Headnotes 

The headnote — a general qualifying statement in brackets, centered under the title — 
should be used only when it applies to all or almost the entire table or clarifies the 
contents of the table by expanding or qualifying the title. The headnote ends without 
a period, even if the last statement is a complete sentence; but internal periods are 
used if required by sentence structure. (See Tabular Notes.) 


BOXHEADS 


The boxhead consists of the stubhead, column heads, and spanner heads that classify, 
describe, or qualify the column or columns to which they refer. The heads are placed 
approximately in the center of areas defined by real or imaginary lines (boxes) 
directly above the vertical columns of information to which they apply. 

Parts of the Boxhead 

The column head is the basic unit of the boxhead, and each column should have one. 
It may or may not be qualified, supplemented, or described by one or more spanner 
heads above it. 

Spanner heads, or multicolumn heads, are placed above two or more subordinate 
column heads to clarify, describe, or shorten the subordinate heads. (See also 
Spanners.) A single spanner head may also span two or more subordinate spanner 
heads, as in this example: 


                                      First-time students only 
                            -------------------------------------------- 
                                                        Percent of 
State or       All                 Number               total number 
other area     students     Total    Men    Women       Men    Women 


In double-page-spread tables, spanners continue from the left-hand to the right-hand 
page of the pair, with “ — Continued” added following the repeated spanners on the 
right-hand page. 

A banner head, which is a special type of spanner head that is rarely needed, extends 
over all columns except the stub. The best use of a banner head is as a “read-in” line 
that clarifies data in the columns in relation to the column heads. In the following 
example, the banner is appropriate to all data columns and identifies the data shown 
as different from what the single column heads indicate. 



                               Licensees and stations in: 
            ----------------------------------------------------------------------- 
            Aggregate                 Great                    West & 
Type of     United       North        Lakes         South-     South-     Outlying 
licenses    States       Atlantic     and Plains    east       west       areas 


Wording and Punctuation in Column Head 

Column heads should read horizontally — almost never vertically. Wording is brief, 
as in other parts of the table, and requires careful phrasing. Horizontal space almost 
always can be saved by using multicolumn heads, by putting wide heads on more 
lines, by hyphenating words at the ends of lines, and by using standard, easily 
understood abbreviations where necessary. (See Breaking and Hyphenating Words.) 
To avoid an overly formal appearance, capitalize only the first letter of the first word 
in each head and the first letters of any proper nouns. 

Sequence of Columns 

Total and subtotal columns are placed at the left of the columns that they aggregate, 
except in financial tables prepared specifically for accounting purposes, which require 
totals at the right. In NCES publications intended for broad readership, tables 
showing dollar amounts have totals on the left. Derived figures — such as averages, 
ratios, and percentages — usually are placed in columns to the right of the base 
figures. 

Spacing in the Column Head 

The illustrations following show minimum, normal, and maximum recommended 
spacing in the boxhead. 


Minimum vertical          Normal vertical           Maximum vertical 
spacing in the boxhead    spacing in the boxhead    spacing in the boxhead 


In these three examples, the column head is approximately centered vertically in the area 
assigned. For minimum spacing, no blank space is left above or below this head; this 
spacing should only be used in cases where space is at a premium. 

Each column heading in the body of the table should be placed flush right over the 
column. Within each set of column headings, each column heading should end on the 
same line. (See Placing Figures in the Column.) 

In a ruled table, all of the column-heading boxes on the same level should be the 
same height, as determined by the column heading with the most typed lines. 


Wrong: 

Research        Formula and      Training 
grants          project          grants 
1968   1969     grants           1968   1969 
                1968   1969 

Right: 

                Formula and 
Research        project          Training 
grants          grants           grants 
1968   1969     1968   1969      1968   1969 


Units of Measurement in the Column Head 

Units of measurement (e.g., pounds, percent, dollars) often appear in the column 
head. When they do, they should be placed after or below the column-head captions 
that they modify. Sometimes a unit of measurement comprises an entire column 
head. 

1. If it modifies a caption, enclose it in parentheses and use all lowercase 
letters; for example: “Expenditures (millions of dollars)” and “Dollars 
awarded (in thousands).” Abbreviations, if used, should be clear; for 
example: “Floor area (1,000 sq. ft.)” or “Floor area (thous. sq. ft.).” 

2. If it comprises an entire column head, omit the parentheses and treat it like 
any other column heading. Capitalize, for example, “Billions of dollars” or 
“Percent of total” if it is the entire head. 

Column Numbers or Letters 

Occasionally, tables with many column headings need numbered or lettered columns 
for ease of reference. The numbers or letters appear just below the boxhead and run 
in sequence from left to right beginning with the stub. Column numbers or letters 
may be enclosed in parentheses or separated from the rest of the table by a horizontal 
ruling. 


Stub       Column   Column        OR        Stub       Column   Column 
caption    head     head                    caption    head     head 
(1)        (2)      (3)                     --------------------------- 
                                            1          2        3 

Total      986      461                     Total      986      461 
Item         0       73                     Item         0       73 
Item       986      388                     Item       986      388 


Breaking and Hyphenating Words 

Most often in headings (but also in stubs), breaking and hyphenating words is 
necessary. The guide for breaking words and use of hyphens is the GPO Style 
Manual and its Word Division Supplement. Comments on some common pitfalls 
follow: 


Break words only between syllables: usually divide doubled consonants (e.g., 
syl-la-bles but en-roll-ments). 

Never break one-syllable words. 

Avoid breaking: 

Words that would leave a one-letter syllable on a line (not a-mendment).
Words of four or five letters. 

Always hyphenate as follows: 

Full-time equivalents and full-time-equivalent number. 

Nondegree-credit students (but noncredit courses and activities).
Nonscience-related curricula. 

Nonengineering-related technologies. 

First-professional degrees.

THE TABLE STUB


The stub consists of a heading and the line captions that are listed at the left side of a 
table and describe each row of figures in the field. Capitalize only the first letter of 
the first word and the first letters of any proper nouns in both the stub heading and the 
line captions. Always provide a stub heading that describes, defines, or amplifies the 
stub captions. Use a word such as “Item” or “Characteristic” for a collection of stub 
entries that defy brief classification: 


Characteristic          First column
                             heading

   Total                    $980,000
Subtotal                     425,000

When the stub is too long for one page and must be continued on another page, the 
continuation should also be placed at the left side of the second page. 

For a double-page spread, the stub in this sample survey example should be repeated 
on the right side of the right-hand page. Line numbers may be substituted for the 
right-hand stub if space is tight. (See Double-Page-Spread Tables.) 

Left-hand page                            Right-hand page with stub

Characteristic    First column            Last column    Characteristic
                       heading                heading

   Total              $990,000             $1,460,000       Total
Subtotal               425,000                678,000    Subtotal


Organization of the Stub 

Place grand totals at the top of the column stub. Then the items in a stub should be 
displayed in a logical sequence. Some typical categories are alphabetical, 
geographical, chronological, numerical, quantitative (by size), customary (commonly 
accepted order), progressive (order of growth or development), and importance. 
Sometimes the arrangement of items in a stub of a single table may fall into two or 
more categories. For example, the main order might be geographic (which could be 
customary also) with the states listed alphabetically, sometimes listed under each 
geographic region. 

Convention requires year entries showing trends to run sequentially from earliest to 
latest. Stub entries consisting entirely of years are centered in the area allotted to the 
stub.

Indentation in the Stub

When there are multiple levels of subordination to be displayed within a table, 
indentation of the stub can provide a road map to help readers follow the flow of a 
table. Indentation can best be accomplished by setting tabs at a space equivalent to a 
specified number of the letter “n.” 

• Grand totals — If there is only one other level, indent three “n” spaces (i.e., 
start in the fourth space). Indent five spaces if there are two or more levels of 
subordination. 

• Major group or subtotal captions — Start the caption line at the left edge of 
the table. Indent any continuation lines three “n” spaces. 

• Subordinate captions — Tab two additional “n” spaces for each subsequent 
level of subordination (e.g., two “n” spaces for the third level group and four 
“n” spaces for the fourth level). Indent any continuation lines three “n” spaces.

For example: 


Stub head                       Spanner head
(centered vertically        ----------------
and flush left)                    Col. head

     Total                             1,625

Major group                              860
  Minor group                            514
    Item                                 101
    Item                                  98
    Item                                 193
    Item                                  32
    Item                                  47
    Item                                  43
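
The indentation rules above can be expressed as a small function. The following is a minimal sketch, assuming ordinary spaces stand in for “n” spaces and that level 1 is the grand total, level 2 a major group, and level 3 and beyond the subordinate captions; the names `stub_indent` and `render_stub` are hypothetical.

```python
def stub_indent(level, total_levels):
    """Return the number of "n" spaces before a stub caption.

    level:        1 = grand total, 2 = major group, 3+ = subordinate captions
    total_levels: number of levels in the table, counting the grand total
    """
    if level == 1:
        # Three spaces with only one other level, five with two or more.
        return 3 if total_levels <= 2 else 5
    # Major groups start at the left edge; each further level adds two spaces.
    return 2 * (level - 2)


def render_stub(rows):
    """rows is a list of (level, caption) pairs in display order."""
    total_levels = max(level for level, _ in rows)
    return [" " * stub_indent(level, total_levels) + caption
            for level, caption in rows]


for line in render_stub([(1, "Total"), (2, "Major group"),
                         (3, "Minor group"), (4, "Item")]):
    print(line)
```

Continuation lines of an overrun caption are not handled here; under the rules above they would be indented three further “n” spaces.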


Vertical Spacing in the Stub 

Normal vertical spacing in the stub leaves a blank line between the total and the first 
group caption, between group captions, and between a subordinate series and the
following superior group caption. (See Spacing in the Column Head and Sizing a
Table.)

When available vertical space is tight, normal spacing can be minimized by reducing
the height of blank lines. At the absolute minimum, all blank lines between stub
captions are removed, and the total and all major group (subtotal) captions are set
in boldface.

Subordinate items under a group caption are usually single-spaced except when there 
is a long list of such items. Then it is best, if space permits, to group them by three, 
four, five, or more items with a blank “reader” line between groups. 

Boldface Type 

When vertical spacing is tight, boldface type, instead of line spacing, may be used to 
set off group captions in the stub. The separation is indicated by bolding the group 
captions. Boldface type may also be used to make totals and subtotals stand out. But 
first, the table should be examined carefully to determine whether appropriate spacing
and indentation of the stub captions without using boldface type could achieve the same
result. 

Wording and Punctuation in the Stub 

Stub captions should be as brief as possible without losing precision and clarity. If 
space is limited, abbreviations are used only when they can be understood instantly. 
Minimum punctuation is used to make the meaning clear. Periods are omitted at the 
ends of stub captions and may also be omitted after abbreviations to save an 
additional space in very tight stubs, if the meaning is clear. 

Leaders 

Leaders are rows of periods connecting the last word of a stub caption (last line of an 
overrun) with the first data column. If used in tables with no vertical rulings, two or 
three spaces should separate the leaders from the longest number in the first data 
column. Use leaders only when a wide space divides the stub caption and the first 
column of data in the body of the table. 

Leaders are always omitted after stub captions without entries opposite them in the 
field. They are almost always omitted in the duplicate stub at the far side of the right- 
hand page of a double-page spread. 

Line Numbers 

The main use for line numbers is for convenience of reference or to alleviate a tight 
stub situation in the right-hand side of a double-page-spread table. When they are
used, all stub captions that identify entries in the field should be numbered 
consecutively. The line numbers are lined up as a column, two spaces to the left of 
the stub entry positioned farthest left and are placed opposite the last line of an 
overrun caption. 

In a double-page-spread table the line numbers should be repeated on the right-hand
page, as the last column two spaces to the right of the longest line of the duplicate
stub. If space is very limited, use line numbers only (omitting the duplicate stub),
matching them with the line numbers of the stub on the left-hand page. The
illustrations below use sample survey data to show how to use line numbers in a
normal double-page-spread table and in one in which space is limited.


Left-hand page

     Characteristic      First column heading    Last column heading

 1   Total                           $980,000             $1,460,000
 2   Subtotal                         425,000                678,000
 3   Item                              98,000                229,000
 4   Item                             135,000                 65,000
     Item with
 5     overrun                          8,000                187,000

Right-hand page with stub

     Characteristic

     Total            1
     Subtotal         2
     Item             3
     Item             4
     Item with
       overrun        5

Right-hand page without stub

     Last column heading

              $1,460,000    1
                 678,000    2
                 229,000    3
                  65,000    4
                 187,000    5


Continuations 

When a category with subcategory listings breaks over to another page, all superior 
categories should be repeated, with the word “ — Continued.” 


For example:

Foreign languages
  French
  Spanish
  Other foreign languages
Social sciences
  U.S. history
  World history
  Other history:
    Ancient history
    Oriental history


Social sciences — Continued
  Other history — Continued
    Bible history
    Local history
  Geography
  Government studies
  Current events
  Other
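
The continuation rule can be sketched as follows. This is an illustration under stated assumptions: stub rows are given as (depth, caption) pairs with depth 0 for the highest category, a page holds a fixed number of stub lines, and the function name `paginate_stub` is hypothetical.

```python
def paginate_stub(rows, lines_per_page):
    """Split stub rows into pages, repeating superior categories.

    rows is a list of (depth, caption) pairs; depth 0 is the top level and
    each row is at most one level deeper than the row before it.
    """
    pages, page, ancestors = [], [], []
    for depth, caption in rows:
        if len(page) >= lines_per_page:
            pages.append(page)
            # Start the new page by repeating every category that is
            # superior to the row about to be placed.
            page = [(d, c.rstrip(":") + " \u2014 Continued")
                    for d, c in ancestors[:depth]]
        # Remember the current chain of categories down to this row.
        ancestors[depth:] = [(depth, caption)]
        page.append((depth, caption))
    if page:
        pages.append(page)
    return pages


rows = [(0, "Social sciences"), (1, "U.S. history"), (1, "World history"),
        (1, "Other history:"), (2, "Ancient history"), (2, "Oriental history"),
        (2, "Bible history"), (2, "Local history"), (1, "Geography")]
for depth, caption in paginate_stub(rows, 6)[1]:
    print("  " * depth + caption)
```

With a six-line page, the second page opens with the two repeated captions before the remaining history entries, as in the example above.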

THE BODY


Body, or field, is the part of a table that contains the numerical data — below the 
column heads and to the right of the stub. It consists of cells, rows, and columns. A 
cell is the space occupied by one entry in the field. A row is a horizontal array of 
cells opposite a stub caption. A column is a vertical array of cells under a column 
heading. 

Units of Measurement in the Body 

Units of measurement usually do not appear in the body. The preferred places for 
units of measurement are in a headnote, if they apply to all or nearly all of the table 
(see Headnotes), or in the boxhead, if they vary by column (see Units of 
Measurement in the Column Head). 

Spanners 

Spanners are multicolumn headings that cross the table within the field instead of in 
the boxhead. In the summary table below, the column heads at the top of the table 
apply to all levels in the field. The field spanner is most useful when emphasis on a 
change of category is needed and the label applies directly to the data in the field. 
Field spanners should not be used when they apply to the stub-entry classification.


Table 1. Number and percentage distribution of families, by family status and
         presence of own children under 18: Current Population Survey, 1970 to
         1998

                                                                  Change,    Change,
                                                                  1970 to    1980 to
Family status                          1970      1980      1998      1980       1998

                                           In thousands              Percent change

     All families                    51,456    59,550    70,880      15.7       19.0

Married-couple family                44,728    49,112    54,317       9.8       10.6
  No own children under 18           19,196    24,151    29,048      25.8       20.3
  With own children under 18         25,532    24,961    25,269      -2.2        1.2
Other family, male householder        1,228     1,733     3,911      41.1      125.7
  No own children under 18
  With own children under 18
Other family, female householder

                                                                Change in percentage
                                      Percent of all families          points

     All families                     100.0     100.0     100.0     100.0      100.0

Married-couple family                  86.9      82.5      76.6      -4.5       -5.8
  No own children under 18             37.3      40.6      41.0       3.3        0.4
  With own children under 18           49.6      41.9      35.7      -7.7       -6.3

Field spanners sometimes are used to reduce the length and increase the width of very
narrow and long tables. They also may be used for placing long major group captions 
in the field when there is not enough room for them in the stub. These advantages are 
offset to some extent by their unfavorable location in the field where they break 
across the columns and separate the figures from the descriptive column headings. 

Decimals, Zeros, and Dollar and Percent Signs 

In a column of figures containing decimal fractions, figures of less than 1 have a zero 
(0) to the left of the decimal point. However, do not use a zero before a decimal 
fraction when the number cannot be greater than 1 (e.g., levels of statistical 
significance, proportions, or correlations). If there are whole numbers (numbers 
without decimal fractions) in the column, they are recorded with a decimal and zero 
to the right of the decimal point. All figures in a table that are reported in the same 
unit of measurement should report data to the same decimal value. If the column 
consists entirely of whole numbers, do not use decimal points and zeros. The 
recorded number of decimal places should offer no greater degree of precision than is 
warranted by the data (see Standard 5-3, NCES Statistical Standards, 2002). 

As shown below, the only exception to these rules is that in the case of a universe 
survey an absolute zero (0) is always expressed as a single zero without a decimal 
point; in a column of decimal fractions, it is positioned as shown. 


TABLE A                TABLE B                TABLE C

Item A       0         Item A        0        Item A     0
Item B     0.7         Item B     0.72        Item B     1
Item C     4.0         Item C     4.00        Item C     4
Item D    18.6         Item D    18.64        Item D    19
Item E       #         Item E        #        Item E     #

# Rounds to zero.



When all of the figures in a column pertain to money, the first figure in the column 
should be preceded by a dollar sign ($), even though the column heading or a 
headnote indicates the unit of measurement (e.g., millions of dollars). 

A percent sign (%) should not follow figures in the field. If all are percentages, the 
fact may be indicated in a headnote; if some columns or lines are percents, indicate in
a spanner, individual column heads, stub entry, or title, as appropriate (e.g., “in 
percent”). The word “percent” instead of “percentage” is preferred in this context; 
the symbol (%) should be used only if there is no room to spell it out. 
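
The rules in this section lend themselves to a short formatting routine. The sketch below is an illustration, not an NCES tool: the function name `format_column` is hypothetical, half-up rounding is an assumption (the text does not name a rounding method), and right alignment is borrowed from the guidance on placing figures in the columns.

```python
from decimal import Decimal, ROUND_HALF_UP


def format_column(values, places, money=False):
    """Format one column of figures as right-aligned strings.

    values: numbers, or strings holding special symbols such as "#"
    places: number of decimal places shared by every figure in the column
    money:  if True, put a dollar sign before the first figure only
    """
    quantum = Decimal(1).scaleb(-places)      # places=1 gives Decimal("0.1")
    cells = []
    for value in values:
        if isinstance(value, str):
            cells.append(value)               # special symbol, passed through
        elif value == 0:
            cells.append("0")                 # absolute zero: no decimal point
        else:
            rounded = Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)
            # A nonzero figure that rounds to zero is shown as "#".
            cells.append("#" if rounded == 0 else f"{rounded:,}")
    if money:
        for i, value in enumerate(values):
            if not isinstance(value, str):
                cells[i] = "$" + cells[i]     # dollar sign on first figure only
                break
    width = max(len(cell) for cell in cells)
    return [cell.rjust(width) for cell in cells]


for cell in format_column([0, 0.7, 4, 18.6, 0.04], places=1):
    print(cell)
```

Figures below 1 keep their leading zero and whole numbers gain a decimal and zero, so every entry in the column reports the same decimal value. Columns of proportions or correlations, which take no leading zero, would need separate handling.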

Placing Figures in the Columns 

Allow a minimum of one space on each side of an entry. Entries should be aligned at
the right-hand side, including absolute zero in number columns. For two-line stub
captions, entries are placed opposite the second line. Leave no cell empty; if a
number is not available, insert the appropriate explanatory special symbol in the cell.
(See list in Special Notes.)


Arranging Figures for Ease of Comparison 

The closer numbers are to each other, the easier it is to compare them. Vertical 
comparisons usually can be made more rapidly than horizontal comparisons. In the 
following example of universe data, arrangements A and B both are satisfactory, but 
the vertical listing in A is more effective because it is much easier to locate the largest 
and smallest numbers and to determine differences in the general sizes of the 
numbers. 

      A                                           B

  102,007,666       102,007,666   1,998,464,732   99,428,531   941,325   23,918
1,998,464,732
   99,428,531
      941,325
       23,918

(NOTE: The vertical arrangement brings the figures closer
together and requires less movement of the eyes.)


The following tabulations show identical universe data, but the vertical comparison in
B emphasizes the within-item comparisons over time.


Table A                                     Table B

          FY      FY      FY      FY        Fiscal     Item    Item    Item    Item
Item    1964    1965    1966    1967        year          A       B       C       D

A      1,192   6,195   8,628   7,107        1964      1,192     647      92   5,430
B        647     502     111     835        1965      6,195     502      86   1,999
C         92      86      75      42        1966      8,629     111      75   3,671
D      5,430   1,999   3,671   4,442        1967      7,107     835      42   4,442
E        775     215     303     629        1968      2,888     229      34   1,041


In any table, the comparisons that are the most important should be placed as close 
together as possible for maximum emphasis. 
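
Turning a table so that the key comparison runs down a column amounts to transposing the field and swapping the stub with the column heads. The following is a minimal sketch; the function name `turn_table` and the list-of-rows representation are assumptions made for illustration.

```python
def turn_table(column_heads, stubs, body):
    """Swap the roles of the stub and the boxhead.

    body holds one list of figures per stub entry. The result lists the
    old column heads down the stub and the old stub entries across the top.
    """
    turned_body = [list(column) for column in zip(*body)]
    return stubs, column_heads, turned_body


column_heads = ["FY 1964", "FY 1965", "FY 1966", "FY 1967"]
stubs = ["A", "B"]
body = [[1192, 6195, 8628, 7107],
        [647, 502, 111, 835]]

new_heads, new_stubs, new_body = turn_table(column_heads, stubs, body)
for year, figures in zip(new_stubs, new_body):
    print(year, figures)
```

After turning, the figures for one item across the years sit directly above one another, which is the arrangement that favors comparison over time.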

TABULAR NOTES


Tabular notes contain supplementary information necessary for a correct
understanding of the table or a part of it. They fit into two categories: (1) headnotes 
at the top of the table are used only occasionally, and (2) footnotes at the bottom of 
the table are used often. Footnotes include general notes, reference notes, and source 
notes. 

Tabular notes should be kept as brief as possible without sacrificing clarity. Topical 
style is used, with subject-noun, verb, articles, and other parts of speech omitted if not 
essential to understanding. 

Headnotes 

A headnote is a special explanation that should be seen before the rest of the table is 
read. The headnote should be used only when it applies to all or almost all of the 
cells in the body of the table or if it clarifies the contents of the table by expanding or 
qualifying the title. Sometimes, careful wording of title and column heads can 
eliminate the need for headnotes. Consider, instead of the headnote, a general note
(NOTE: Data are ...), or a reference footnote with the symbol attached to column
heads or stub. Reference notes attached to the title should be avoided, if possible.

A headnote should be centered above the boxhead; if two lines are needed, the second 
should be centered under the first. It should be enclosed in brackets and typed in 
lowercase letters, except for the first letter of the first word and the first letters of 
proper nouns and adjectives. No period is placed after the last word; if more than one 
sentence, a period ends all but the last sentence. The following are typical examples 
of headnotes. 


[Based on a 10-percent sample of applications] 

[Includes both public and private] 

[Millions of dollars] 

Sometimes a headnote may indicate a unit of measurement that applies to some, but 
not all, of the columns of figures: 

[Dollar amounts in thousands] 

Normally, one blank line separates the headnotes from the table title; but more room
may be left, if necessary, to make the table fit the available space. Two blank lines
usually separate the headnote from the top line of the boxhead. 

Special notes 

Special notes are notes that are standard for cells in the body of tables and usually
refer to a statistical property of the specific cell (e.g., not applicable, missing, an
unstable estimate, statistically significant). Special notes fill cells in the body of
tables, and do not require parentheses. When special notes are used, they should
always be listed in the following order. The following list summarizes a number of
statistical special notes and related set of symbols that should be used consistently
across all NCES reports. If necessary, additional explanatory notes may be added to
the end of relevant notes.

Symbol   Label                          Meaning

—        Not available                  Data were not collected or not reported
†        Not applicable                 Category does not exist
#        Rounds to zero                 The estimate rounds to zero
!        Interpret data with caution    Estimates are unstable
‡        Reporting standards not met    Did not meet reporting standards
*        p<0.05                         Significance level

Footnotes

General, reference, and source notes fall at the bottom or “foot” of the table. General
notes refer to all or much of the table; reference notes, to specifically designated
portions; and source notes identify sources of the data. All end with a period.

General notes

General notes, like the headnotes, qualify, describe, or explain whole tables or easily
identifiable parts of them. The choice between a general note and a headnote is
guided by the degree of emphasis required, and the length and detail included in the
note.

The general note is introduced with the word “NOTE” followed by a colon. For
example:

NOTE: Detail may not sum to totals because of rounding.

Reference notes 

Reference notes refer to specifically designated portions of the table. By “keying” the 
note to the material to be qualified, reference notes can be kept brief. 


                     Classroom teachers
States         Full-time    Part-time    FTE¹

¹Full-time equivalent of full-time and part-time.


The positioning of symbols for reference notes in tables follows definite principles. 
The symbols are placed at the right of the word the note applies to, in both headings 
and stubs. They are placed at the right of data in the field of a table; and if a 
numbered footnote stands alone in a cell, it is enclosed in parentheses: (1).

Footnotes are numbered sequentially throughout a single table, but a recurrent
reference repeats the symbol. Footnotes follow a logical order, generally line for line
from left to right and down.

The placement of footnote symbols within a table and the arrangement of notes at 
the end of the table are illustrated in the following table. Footnotes are placed at the 
end of the table. Special notes are listed first, followed by reference footnotes, 
general notes, and then the source. 

Table 1. Families, by family status and presence of own children under 18: Current
         Population Survey, 1970 to 1998

Family status              1970        1980        1998

   All families               —     59,550¹      70,880

— Special symbols are listed first.
¹Numerical footnotes follow.
NOTE: The general note comes next.
SOURCE: The source comes last.

Source Notes 

The source note indicates the specific source of the statistic. In general, the source 
note refers the user to the original (or primary) source and gives credit to the 
originating report, or in the case of new tabulations, the data file. 

The source note should cite the report, relevant survey(s) or subsurvey(s), data 
reference year, file version number, department name, and agency name. In the case 
of unpublished data, use the month and year of the tabulation or data file. If the data 
are drawn from multiple years: for 1 to 3 years, report each year; for more than 3 
continuous years, use the year span; and for more than 3 noncontinuous years, use 
“selected years” and the year span. 
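
The multiple-year rule can be sketched as a small helper. This is an illustration under stated assumptions: the function name `year_phrase` is hypothetical, the wording used to list up to three years is one plausible choice, and the span is written with an en dash and a two-digit ending year when both years fall in the same century.

```python
def year_phrase(years):
    """Describe the data reference years for a source note."""
    ys = sorted(set(years))
    if not ys:
        raise ValueError("at least one year is required")
    # One to three years: report each year.
    if len(ys) == 1:
        return str(ys[0])
    if len(ys) == 2:
        return f"{ys[0]} and {ys[1]}"
    if len(ys) == 3:
        return f"{ys[0]}, {ys[1]}, and {ys[2]}"
    # More than three years: use the year span.
    first, last = ys[0], ys[-1]
    end = str(last)[-2:] if first // 100 == last // 100 else str(last)
    span = f"{first}\u2013{end}"
    continuous = ys == list(range(first, last + 1))
    return span if continuous else f"selected years, {span}"


print(year_phrase([1970, 1980, 1998]))
print(year_phrase(range(1988, 1998)))
print(year_phrase([1972, 1975, 1980, 1997]))
```

The three calls cover the three cases in the rule: each year reported, a continuous span, and selected years with the span.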

Following are some typical examples: 

Data from one or more reports: 

Revenues and Expenditures for National Public Elementary and Secondary
Education: School Year 1997-98, Common Core of Data (CCD), “National Public 
Education Financial Survey” (NPEFS), 1997-98, Version 1, U.S. Department of 
Education, National Center for Education Statistics. 

Data from unpublished tabulations and a published NCES report: 

SOURCE: U.S. Department of Commerce, Bureau of the Census, Current Population 
Survey, previously unpublished tabulation (April 1998); and U.S. Department of Education, 
National Center for Education Statistics, Dropout Rates in the United States. Selected years, 
1972-97.

SIZING A TABLE


Most NCES publications are printed on paper that is 8 1/2 x 11 inches. The “image”
size (area occupied by printed matter) is expected to be about 6 1/2 x 9 1/2 inches,
including space for the page number.

Although this section focuses on ways to reduce dimensions, any reduction should be
kept within reason. The problems of table layout usually are those of too much rather
than too little, and too much vacant space within a table is no less a fault than others.

Some ways to improve the appearance and reduce one or both dimensions include 
pruning, internal revision, spacing reduction, and font reduction, now discussed in 
that order. (See Spacing in the Column Head and Vertical Spacing in the Stub.) 


Pruning 

Trimming a table to alter its shape aims to prune its outline to the desired proportion. 
Of course, internal symmetry also is desirable within reason — such as relatively even 
spacing among the structural elements of the column heads, data columns, and stub 
captions. Here are some suggestions. 

To reduce the width of a table, try — 

1. Typing wide column headings or stub captions on several lines, dividing
words if necessary. 

2. Using spanner (multicolumn) headings over related column headings to 
avoid the repetition of duplicating words. 

3. Paring unnecessary words in or abbreviating the stub captions and column 
headings. (See Wording and Punctuation in the Column Head and 
Wording and Punctuation in the Stub.) 

4. Rounding columns of figures. 

To reduce the length of a table, try — 

1. Typing column headings or stub captions on fewer lines by abbreviating or
by placing more words on each line. 

2. Removing the blank lines in the column headings or stub captions. (See 
Spacing in the Column Head and Vertical Spacing in the Stub.) 

4. Omitting a blank line above or below headnotes. (See Headnotes.)

5. Omitting the blank line below the first footnote or placing two or more
footnotes on one line.

6. Examining the stub to eliminate unnecessary nondata captions. 

7. Paring unnecessary words in column heads and stub captions by using
spanner headings.

Internal Revision

Sometimes an odd-shaped table can be tailored to fit a single page by revising its 
internal structure. For example, if the table is very wide and short, the table may be 
“turned,” by reversing the functions and positions of the stub and the boxhead. (See 
Arranging Figures for Ease of Comparison.) Or, the boxhead may be divided into 
two levels, repeating the stub as below: 


                   Research grants               Training grants
Fiscal year                    Dollars                       Dollars
                 Number     (millions)         Number     (millions)

1970
1980
2000

                   Formula grants                Project grants

1970
1980
2000




Conversely, if the table is narrow and much too long for the page, using a double-up 
table format may shorten it. In less drastic situations, some data columns or data lines 
may be eliminated by incorporating low-yield categories, or more of them, in an 
“other” (residual) category or by eliminating categories entirely if they yield no data. 

Spacing Reduction 

Blank lines can be variably sized. By reducing the vertical spacing from a full line to 
three-quarters or one-half a line, the size of a table may be reduced and the number of 
printed pages may be reduced. 

Font Reduction 

Smaller fonts can be used to reduce tables that are too long and/or too wide. It is often
desirable to reduce statistical tables for other reasons also. With exceptionally long
tables, the number of printed pages may be large. By using a smaller font, the number
of printed pages may be substantially cut, thus making the publication easier to use as
well as lowering the printing, storage, and mailing costs. And many tables are easier
to read at a slightly smaller size. Note, however, that the minimum practical font size is 9.

APPENDIX

HOW TO PRODUCE TABLES AT NCES 

(This is a quick guide; for more details, see the 2002 edition of 
NCES Guidelines for Tabular Presentations) 


GENERAL 

1) An NCES report may contain as many as three different types of tables. 

• Summary/text tables — range in size from a few lines to one or two pages and require titles and 
table numbers. You can place each summary table at or near the first text reference to the 
table or group them at the ends of chapters or at the end of the report in the order mentioned in 
the text. Summary tables precede reference tables if there are any. 

• Reference tables — detailed tables containing large quantities of data. They usually form a 
separate section at the end of the text or in an appendix (in the order mentioned in the text). 
When these tables include standard errors along with the data they should be placed in an 
appendix. 

• Methodological tables — contain relevant statistics for the data in a report; for example, sample 
sizes, coefficients of variation, or standard errors. Place these tables in an appendix and 
follow the order the tables are presented in the report. 

2) If you disperse tables throughout the text, refer to each of them in the narrative, and refer to
them sequentially (i.e., the tables should appear in the order mentioned in the text).


TABLE TITLES 

1) Start out with the topic of the table, followed by a comma and then the “by” list. 

2) In the “by” list, items in the columns are listed first, followed by the items in the rows. 

3) End the title with a colon followed by the data year(s). 

4) Capitalize only the first word, proper nouns, and the word following the colon. 

5) Avoid footnoting a title; use a general note (i.e., NOTE) instead. 

6) Year spans — use 1988-97 or 1988 through 1997 for a span of calendar years; 1988 and 1987 for 
two distinct years. Use fiscal years 1989-98 or fiscal years 1989 through 1998 for a span of 
fiscal years. And use academic year 1988-89 for one school year or academic years 1988-89 
through 1991-92 for a span of school years. Use en dashes instead of hyphens between years. 
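
Rules 1 through 4 above describe a fixed pattern for a title, which can be sketched as a small function. This is an illustration only: the names `join_items` and `table_title` are hypothetical, the way the “by” items are joined is an assumption, and the sample topic is invented for the example.

```python
def join_items(items):
    """Join the items of a "by" list into one phrase."""
    if not items:
        raise ValueError('the "by" list needs at least one item')
    if len(items) == 1:
        return items[0]
    if len(items) == 2:
        return f"{items[0]} and {items[1]}"
    return ", ".join(items[:-1]) + f", and {items[-1]}"


def table_title(topic, column_items, row_items, years):
    """Assemble a title: topic, then the "by" list, then a colon and the years."""
    # Items in the columns are listed first, followed by items in the rows.
    by_list = join_items(list(column_items) + list(row_items))
    # Capitalize only the first word and the word following the colon;
    # proper nouns are expected to arrive already capitalized.
    topic = topic[0].upper() + topic[1:]
    years = years[0].upper() + years[1:]
    return f"{topic}, by {by_list}: {years}"


print(table_title("number of teachers", ["sex"], ["state"], "1997\u201398"))
```

The year portion is passed in already formatted, with an en dash between years as rule 6 requires.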

ORGANIZATION OF SIDE STUBS


1) Place the stub header flush left. 

2) Report grand totals in the first row of the table. 

• If the table has only two levels, that is, the grand total and one disaggregation, tab over three 
spaces the size of the letter “n” to start the table grand total. 

• If the table has three or more levels, that is the grand total and at least two disaggregations, tab 
over five “n” spaces to start the grand total. 

3) Start the label for the first level of disaggregation (that is, the major group or subtotal) at the left 
margin of the table. 

4) Tab over two “n” spaces to start the second level of disaggregation. 

5) Tab over four “n” spaces for the label for a third level of disaggregation (continue this pattern 
for additional levels of disaggregation). 

6) If a row label needs a footnote, place it to the right of the label. 

7) If the rows are school years, use “School year ending” as the stub and then use the single years 
across from such a stub. 

8) Use full state names in table stubs. 

9) Use an en dash to designate “through” when referring to age. 

HEADERS 

1) Place the side stub head flush left. 

2) Column spanners should be centered over the set of columns they describe. 

3) Place each column head flush right. 

4) If a column header needs a footnote, place it to the right of the header. 

5) If the columns are school years, use “School year ending” as a spanner head and then use the 
single years under such a head. 

6) Use an en dash to designate “through” when referring to age span.

BODY OF THE TABLE


1) Do not mix different measurements of data in the same column (e.g., percents and counts). 

2) In tables displaying dollar amounts over time, indicate whether the amounts are current or 
constant dollars and include the base year (e.g., in constant/current 1997 dollars). Place dollar 
signs only in the first row. 

3) If the rows and columns in a table may not add to the totals presented, add a general note,
“Detail may not sum to totals because of rounding.”

4) LEAVE NO TABLE CELLS BLANK 

• Use a † if data for the cell are not applicable. (Do not use NA). Use — for not available
(i.e., not reported) and # for rounds to zero. These symbols fill the cell and do not require
parentheses. See Attachment for a list of symbols to use with blank cells.

• If a cell is blank for a reason not covered by a special symbol, footnote the cell with the 
footnote number in parentheses flush right in the column. 

5) If a number in a cell needs a footnote, place the footnote to the right of the number; and if a 
numerical footnote stands alone in a cell, it is enclosed in parentheses: (1). 

6) In order to place a zero in a cell, the measure must actually be zero based on universe data. (It is 
preferable to report it 0, not 0.0.) 

7) Use a line of periods (leaders) only when a wide space divides the stub and the first column. 

8) In text and summary tables, round percentages to no more than one decimal place, round four- 
and five-digit numbers to hundreds, and round six-digit numbers and over to thousands. 

9) In reference and methodology tables, round percentages to no more than two decimal places, 
except in certain methodological tables, where a finer breakdown may be necessary. Standard 
errors should be reported to one decimal place more than the related estimate. 
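
The rounding rules in items 8 and 9 can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, counts are whole numbers, and half-up rounding is assumed because the guide does not name a rounding method.

```python
from decimal import Decimal, ROUND_HALF_UP


def round_count(n):
    """Round a whole-number count for a text or summary table."""
    digits = len(str(abs(n)))
    if digits >= 6:
        unit = 1000          # six-digit numbers and over: to thousands
    elif digits >= 4:
        unit = 100           # four- and five-digit numbers: to hundreds
    else:
        return n             # smaller numbers are left as they are
    return int((Decimal(n) / unit).quantize(Decimal(1), rounding=ROUND_HALF_UP)) * unit


def round_percent(value, places=1):
    """Round a percentage to the given number of decimal places."""
    quantum = Decimal(1).scaleb(-places)
    return str(Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP))


def round_standard_error(value, estimate_places):
    """Report a standard error to one more decimal place than its estimate."""
    return round_percent(value, estimate_places + 1)


print(round_count(51456), round_count(1234567))
print(round_percent(15.73), round_standard_error(0.4567, 1))
```

For reference and methodology tables, `round_percent` would be called with `places=2` instead of the default.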


BOTTOM OF TABLE 

1) Footnotes — bring all lines of a footnote flush left. 

2) Symbol footnotes precede numbered ones at the bottom of the table. 

3) Place numbered footnotes next. 

4) The general note comes next; bring all lines for notes flush left. There may be more than one 
note, but they are all reported in one NOTE: section. 

5) The last entry at the bottom of the table is the SOURCE: Department name, agency name, major 
survey or publication title, subsurvey title (in quotes), and year of survey or publication. 

6) For unpublished data, use the month and year of the tabulation or tape file. 

7) For up to 3 years of data, state each year. For more than 3 continuous years, give the year span. 
For more than 3 noncontinuous years, use “selected years” and the year span. 

8) Use a semicolon to separate sources from the same agency. Use a period to separate agencies. 
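
Items 7 and 8 describe a small decision rule that can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, years are written with four digits, and the exact wording produced for lists and for "selected years" is one plausible reading of the rule rather than a prescribed form.

```python
def format_years(years):
    """Apply item 7: list up to 3 years, otherwise give a span."""
    ys = sorted(set(years))
    if not ys:
        raise ValueError("at least one year is required")
    if len(ys) <= 3:
        if len(ys) == 1:
            return str(ys[0])
        if len(ys) == 2:
            return f"{ys[0]} and {ys[1]}"
        return f"{ys[0]}, {ys[1]}, and {ys[2]}"
    span = f"{ys[0]}-{ys[-1]}"
    # More than 3 years: continuous runs get the bare span,
    # noncontinuous runs are flagged as "selected years".
    is_continuous = ys == list(range(ys[0], ys[-1] + 1))
    return span if is_continuous else f"selected years {span}"


def source_note(sources_by_agency):
    """Apply item 8 to a list of (agency, [source, ...]) pairs.

    Sources from the same agency are joined with semicolons;
    each agency's group ends with a period.
    """
    groups = [
        f"{agency}, " + "; ".join(sources) + "."
        for agency, sources in sources_by_agency
    ]
    return "SOURCE: " + " ".join(groups)
```

For example, format_years([1984, 1989, 1993, 1997]) yields a "selected years" span, while seven consecutive years collapse to a plain span.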

Attachment 


The following list summarizes the special statistical notes and the related symbols that 
should be used consistently across all NCES reports. If necessary, additional explanatory notes may 
be added to the end of relevant notes. 


Symbol   Special note                   Meaning 

—        Not available                  Data were not collected or not reported 
†        Not applicable                 Category does not exist 
#        Rounds to zero                 The estimate rounds to zero 
!        Interpret data with caution    Estimates are unstable 
‡        Reporting standards not met    Did not meet reporting standards 
*        p<0.05                         Significance level 
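
The rule that no table cell is left blank, together with the rule for reporting zero, can be sketched as a single cell-rendering function. This is a hypothetical helper, not part of the standard. It covers only the three symbols named in item 4 of the table-body rules (not applicable, not available, rounds to zero), and it assumes that an exact zero from sample data is shown with the rounds-to-zero symbol, which the standard does not state explicitly.

```python
NOT_APPLICABLE = "\u2020"   # dagger
NOT_AVAILABLE = "\u2014"    # long dash
ROUNDS_TO_ZERO = "#"


def render_cell(value, *, applicable=True, places=0, universe=False):
    """Return the text for one table cell; never returns an empty string."""
    if not applicable:
        return NOT_APPLICABLE
    if value is None:
        return NOT_AVAILABLE
    rounded = round(value, places)
    if rounded == 0:
        # A true zero may be printed only when it comes from universe data,
        # and it is written as 0 rather than 0.0.
        if value == 0 and universe:
            return "0"
        return ROUNDS_TO_ZERO  # assumption: also used for sample-based zeros
    return f"{rounded:.{places}f}"
```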



APPENDIX D: Survey Titles 


1. COMMON CORE OF DATA (CCD) 

The NCES Common Core of Data (CCD), “[component name],” 
[four-digit beginning year]-[two-digit ending year]. 

Component names: 

Public Elementary/Secondary School Universe Survey 
Local Education Agency Universe Survey 

State Nonfiscal Survey of Public Elementary/Secondary Education 
Early Estimates of Public Elementary/Secondary Education Survey 
National Public Education Financial Survey 
School District Finance Survey (Form F-33) 

Years — All components are annual from 1987-88 to at least 2004-05. 

Example: 

The NCES Common Core of Data (CCD), “Public Elementary/Secondary School Universe 
Survey,” 1987-88. 
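
As an illustration, the CCD template can be filled in programmatically. The sketch below is hypothetical (the names CCD_COMPONENTS, school_year, and ccd_citation are not NCES identifiers); it takes the component names from the list above and always writes a two-digit ending year, as the template specifies.

```python
CCD_COMPONENTS = {
    "Public Elementary/Secondary School Universe Survey",
    "Local Education Agency Universe Survey",
    "State Nonfiscal Survey of Public Elementary/Secondary Education",
    "Early Estimates of Public Elementary/Secondary Education Survey",
    "National Public Education Financial Survey",
    "School District Finance Survey (Form F-33)",
}


def school_year(begin_year):
    """Four-digit beginning year, hyphen, two-digit ending year."""
    return f"{begin_year}-{(begin_year + 1) % 100:02d}"


def ccd_citation(component, begin_year):
    """Fill in the CCD title template for one component and school year."""
    if component not in CCD_COMPONENTS:
        raise ValueError(f"unknown CCD component: {component!r}")
    return (
        "The NCES Common Core of Data (CCD), "
        f"\u201c{component},\u201d {school_year(begin_year)}."
    )
```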


2. SCHOOLS AND STAFFING SURVEY (SASS) 

Schools and Staffing Survey (SASS), “[component name],” 
[four-digit beginning year]-[two-digit ending year]. 

Component names: 

School District Questionnaire 

Public School Questionnaire 

Private School Questionnaire 

BIA School Questionnaire 

Charter School Questionnaire 

Public School Principal Questionnaire 

Private School Principal Questionnaire 

BIA School Principal Questionnaire 

Charter School Principal Questionnaire 

Public Teacher Questionnaire 

Private Teacher Questionnaire 

BIA Teacher Questionnaire 

Charter Teacher Questionnaire 

Public Library Media Center Questionnaire 

Private Library Media Center Questionnaire 

BIA Library Media Center Questionnaire 

Years — 1987-88, 1990-91, 1993-94, 1999-2000, 2003-04, continuing on a 4-year cycle. 

3. TEACHER FOLLOW-UP SURVEY (TFS) 

Teacher Follow-up Survey (TFS), [four-digit beginning year]-[two-digit ending year]. 

Component names: 

Questionnaire for Former Teachers 
Questionnaire for Current Teachers 

Years — 1991-92, 1994-95, 2000-01, 2004-05, continuing on a 4-year cycle. 


4. PRIVATE SCHOOL UNIVERSE SURVEY (PSS) 

Private School Universe Survey (PSS), [four-digit beginning year]-[two-digit ending year 
until 1997-98, after that four-digit ending year]. 

Years— 1989-90, 1991-92, 1993-94, 1995-96, 1997-98, 1999-2000, 2001-2002, 2003-2004. 
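
The PSS year rule (two-digit ending year through 1997-98, four-digit ending year afterward) can be expressed as a short function. This is a sketch only; pss_citation is a hypothetical name.

```python
def pss_citation(begin_year):
    """Fill in the PSS title template for the school year starting in begin_year."""
    end_year = begin_year + 1
    if begin_year <= 1997:
        end_label = f"{end_year % 100:02d}"   # e.g., 1997-98
    else:
        end_label = str(end_year)             # e.g., 1999-2000
    return f"Private School Universe Survey (PSS), {begin_year}-{end_label}."
```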

5. PRIVATE SCHOOL SURVEY EARLY ESTIMATES 

Private School Survey Early Estimates, [four-digit beginning year]-[two-digit ending year] 

Years— 1989-90, 1990-91, 1991-92, and 1992-93. 


6. NATIONAL HOUSEHOLD EDUCATION SURVEYS PROGRAM (NHES) 

The [component name] Survey of the National Household Education Surveys Program 
([component acronym]-NHES: [four-digit year]). 

Component name; component acronym (or repeat of component name); and year(s): 

Early Childhood Education; ECE; 1991 

Adult Education; AE; 1991, 1995, 1999 

School Readiness; SR; 1993 

School Safety and Discipline; SS&D; 1993 

Early Childhood Program Participation; ECPP; 1995, 2001 

Parent and Family Involvement in Education/Civic Involvement; PFI/CI; 1996 

Youth Civic Involvement; YCI; 1996 

Adult Civic Involvement; ACI; 1996 

Household and Library Use; HHL; 1996 

Parent; Parent; 1999 

Youth; Youth; 1999 

Adult Education and Lifelong Learning; AELL; 2001 
Before- and After-School Programs and Activities; ASPA; 2001 
Parent and Family Involvement in Education; PFI; 2003 
Adult Education for Work-Related Reasons; AEWR; 2003 

Example: 

SOURCE: U.S. Department of Education, National Center for Education Statistics, Adult 
Education Survey of the National Household Education Surveys Program (AE-NHES: 1999). 
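
The NHES listing pairs each component with an acronym and a set of valid years, which lends itself to a lookup table. The sketch below is illustrative: NHES_COMPONENTS holds only a subset of the components listed above, nhes_citation is a hypothetical name, and the output follows the wording and spacing of the example as printed above.

```python
# Subset of the component list above: name -> (acronym, years collected).
NHES_COMPONENTS = {
    "Early Childhood Education": ("ECE", {1991}),
    "Adult Education": ("AE", {1991, 1995, 1999}),
    "School Readiness": ("SR", {1993}),
    "Early Childhood Program Participation": ("ECPP", {1995, 2001}),
    "Parent and Family Involvement in Education": ("PFI", {2003}),
}


def nhes_citation(component, year):
    """Build the survey-title portion of an NHES source note."""
    acronym, years = NHES_COMPONENTS[component]
    if year not in years:
        raise ValueError(f"{component} was not collected in {year}")
    return (
        f"{component} Survey of the National Household Education "
        f"Surveys Program ({acronym}-NHES: {year})."
    )
```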

7. FAST RESPONSE SURVEY SYSTEM (FRSS) 

Fast Response Survey System (FRSS), “[survey title],” FRSS [no.], [four-digit year]. 

Examples: 

The NCES Fast Response Survey System (FRSS), “Survey on Advanced Telecommunications 
in U.S. Private Schools: 1998-99,” FRSS 68, 1999. 

The NCES Fast Response Survey System (FRSS), “Survey on Internet Access in U.S. Public 
Schools, Fall 1998,” FRSS 69, 1998. 

8. INTEGRATED POSTSECONDARY EDUCATION DATA SYSTEM (IPEDS) 

[four-digit year] Integrated Postsecondary Education Data System, “[component name]” 
(IPEDS-[component acronym]: [EITHER two-digit year OR two-digit beginning year-two-digit 
ending year]) 

Component Names, Acronyms, and Years: 

Graduation Rate Survey (IPEDS-GRS: [two-digit year]), 1997, 1998, 1999. 

Fall Enrollment Survey (IPEDS-EF: [two-digit year]), 1987 through 1999. 

Institutional Characteristics Survey (IPEDS-IC: [two-digit beginning year]-[two-digit ending 
year]), 1987-1988 through 1999-2000. 

Completions Survey (IPEDS-C: [two-digit beginning year]-[two-digit ending year]), 1986- 
1987 through 1998-1999. 

Salaries, Tenure, and Fringe Benefits of Full-Time Instructional Faculty Survey (IPEDS- 
SA:[two-digit beginning year]-[two-digit ending year]), 1988-1989 through 1999-2000. 
Fall Staff Survey (IPEDS-S: [two-digit year]), 1993, 1995, 1997, and 1999. 

Finance Survey (IPEDS-F:FY[two-digit year]), FY 1987 through FY 1999. 

Consolidated Survey (IPEDS-CN:FY[two-digit year]), FY 1990 through FY 1999. 

Academic Libraries Survey (IPEDS-L: [two-digit year]), 1988, 1990, 1992, 1994, 1996, 
1998. 

Integrated Postsecondary Education Data System (IPEDS), Fall 2000 
Integrated Postsecondary Education Data System (IPEDS), Spring 2001 
Integrated Postsecondary Education Data System (IPEDS), Winter 2001-02 
Years — 2000 and beyond (survey not broken into subject matter components) 


9. NATIONAL POSTSECONDARY STUDENT AID STUDY (NPSAS) 

[four-digit beginning year]-[two-digit ending year, except for 2000] National Postsecondary 
Student Aid Study (NPSAS:[two-digit ending year, except for 2000]) 

Years— 1986-87, 1989-90, 1992-93, 1995-96, 1999-2000 

Example: 

1992-93 National Postsecondary Student Aid Study (NPSAS:93). 
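
The NPSAS template, including its exception for 2000, can be sketched as follows. The function name is hypothetical, and the treatment of the 1999-2000 collection (four-digit ending year in both the title and the acronym) is an interpretation of "except for 2000" together with the years listed above.

```python
def npsas_citation(begin_year):
    """Fill in the NPSAS title template for the year starting in begin_year."""
    end_year = begin_year + 1
    if end_year == 2000:
        end_label = "2000"                    # the stated exception
    else:
        end_label = f"{end_year % 100:02d}"   # two-digit ending year
    return (
        f"{begin_year}-{end_label} National Postsecondary Student Aid Study "
        f"(NPSAS:{end_label})."
    )
```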

10. NATIONAL STUDY OF POSTSECONDARY FACULTY (NSOPF) 

[four-digit year] National Study of Postsecondary Faculty (NSOPF:[two-digit year]) 

Years — 1988, 1993, and 1999 

Example: 

1993 National Study of Postsecondary Faculty (NSOPF:93). 

Note regarding NPSAS and NSOPF: Starting in 2004, NPSAS and NSOPF will be collected 
under the overall 2004 National Study of Faculty and Students (NSoFaS:04). The standard 
NPSAS and NSOPF references, however, will continue to be used in nearly all cases. 
Generally, the only cases in which NSoFaS will be referred to will be in methodological, 
technical, or introductory text. 


11. BEGINNING POSTSECONDARY STUDENTS LONGITUDINAL STUDY (BPS) 

[four-digit year/two-digit year] Beginning Postsecondary Students Longitudinal Study 
(BPS:[two-digit beginning year/two- or four-digit ending year]) 

1990/92 Beginning Postsecondary Students Longitudinal Study (BPS:90/92) 
1990/94 Beginning Postsecondary Students Longitudinal Study (BPS:90/94) 
1996/98 Beginning Postsecondary Students Longitudinal Study (BPS:96/98) 
1996/01 Beginning Postsecondary Students Longitudinal Study (BPS:96/01) 
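
The four BPS titles above follow one pattern: a four-digit base year and two-digit follow-up year in the title, and two-digit years in the acronym. A minimal sketch, with bps_citation as a hypothetical name:

```python
def bps_citation(base_year, followup_year):
    """Build a BPS title from the base year and the last follow-up year."""
    base_short = f"{base_year % 100:02d}"
    followup_short = f"{followup_year % 100:02d}"
    return (
        f"{base_year}/{followup_short} Beginning Postsecondary Students "
        f"Longitudinal Study (BPS:{base_short}/{followup_short})"
    )
```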


12. BACCALAUREATE AND BEYOND LONGITUDINAL STUDY (B&B) 

[four-digit year/two-digit year] Baccalaureate and Beyond Longitudinal Study 
(B&B:[two-digit beginning year/two- or four-digit ending year]) 

1993/94 Baccalaureate and Beyond Longitudinal Study (B&B:93/94) 

1993/97 Baccalaureate and Beyond Longitudinal Study (B&B:93/97) 

1993/03 Baccalaureate and Beyond Longitudinal Study (B&B:93/03) 

2000/01 Baccalaureate and Beyond Longitudinal Study (B&B:2000/01) 


13. NATIONAL EDUCATION LONGITUDINAL STUDY OF 1988 (NELS:88) 

The National Education Longitudinal Study of 1988 (NELS:88/[two-digit survey year, with 
the exception of the Base Year and four-digit for 2000]), "[study wave], [component name], 
[four-digit survey year], [Data Analysis System (if applicable)]." 

Waves and Components: 

The National Education Longitudinal Study of 1988 (NELS:88), "Base Year, [component], 
1988." 

Student Survey 
School Survey 
Parent Survey 
Teacher Survey 

The National Education Longitudinal Study of 1988 (NELS:88/90), "First Follow-up, 
[component], 1990." 

Student Survey 
Dropout Survey 
School Survey 
Teacher Survey 

The National Education Longitudinal Study of 1988 (NELS:88/92), "Second Follow-up, 
[component], 1992." 

Student Survey 
Dropout Survey 
Parent Survey 
Teacher Survey 
School Survey 
Transcript Survey 

The National Education Longitudinal Study of 1988 (NELS:88/94), "Third Follow-up, 1994." 

No component surveys 

The National Education Longitudinal Study of 1988 (NELS:88/2000), "Fourth Follow-up, 
[component], 2000." 

Postsecondary Education Transcript Study (PETS) 

The National Education Longitudinal Study of 1988 (NELS:88), "High School Effectiveness 
Study, 1990-92." 

Examples: 

The National Education Longitudinal Study of 1988 (NELS:88), "Base Year, Parent Survey, 
1988." 

The National Education Longitudinal Study of 1988 (NELS:88/90), "First Follow-up, Dropout 
Survey, 1990." 

The National Education Longitudinal Study of 1988 (NELS:88/2000), "Fourth Follow-up, 2000, 
Data Analysis System." 
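
The NELS:88 pattern combines a wave-specific acronym with a quoted title built from the wave, an optional component, the survey year, and an optional Data Analysis System tag. The sketch below reproduces the three examples above. NELS_WAVES and nels_citation are hypothetical names; the wave and component data are taken from the listing above.

```python
# Wave -> (survey year, component surveys), as listed above.
NELS_WAVES = {
    "Base Year": (1988, {"Student Survey", "School Survey",
                         "Parent Survey", "Teacher Survey"}),
    "First Follow-up": (1990, {"Student Survey", "Dropout Survey",
                               "School Survey", "Teacher Survey"}),
    "Second Follow-up": (1992, {"Student Survey", "Dropout Survey",
                                "Parent Survey", "Teacher Survey",
                                "School Survey", "Transcript Survey"}),
    "Third Follow-up": (1994, set()),
    "Fourth Follow-up": (2000, {"Postsecondary Education Transcript Study (PETS)"}),
}


def nels_citation(wave, component=None, data_analysis_system=False):
    """Build a NELS:88 title for one wave and optional component."""
    year, components = NELS_WAVES[wave]
    if component is not None and component not in components:
        raise ValueError(f"{component!r} is not a component of {wave}")
    if year == 1988:
        tag = "NELS:88"               # Base Year carries no follow-up year
    elif year == 2000:
        tag = "NELS:88/2000"          # four-digit year for 2000
    else:
        tag = f"NELS:88/{year % 100:02d}"
    parts = [wave]
    if component is not None:
        parts.append(component)
    parts.append(str(year))
    if data_analysis_system:
        parts.append("Data Analysis System")
    title = ", ".join(parts)
    return f'The National Education Longitudinal Study of 1988 ({tag}), "{title}."'
```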


14. HIGH SCHOOL AND BEYOND LONGITUDINAL STUDY (HS&B) 

High School and Beyond Longitudinal Study (HS&B-[class]:[two-digit beginning year]/[two-digit 
collection year]) 

High School and Beyond Longitudinal Study of 1980 Sophomores (HS&B-So:80/92) 

High School and Beyond Longitudinal Study of 1980 Seniors (HS&B-Sr:80/86) 

High School and Beyond Longitudinal Study of 1980 Sophomores, “High School Transcript 
Study” (HS&B-So:80/82) 

High School and Beyond Longitudinal Study of 1980 Sophomores, “Postsecondary 
Education Transcript Study” (HS&B-So:PETS) 

High School and Beyond Longitudinal Study of 1980 Seniors, “Postsecondary Education 
Transcript Study” (HS&B-Sr:PETS) 

15. NATIONAL LONGITUDINAL STUDY OF THE HIGH SCHOOL CLASS OF 1972 
(NLS:72) 

National Longitudinal Study of the High School Class of 1972 (NLS: [two-digit beginning 
year]/[two-digit collection year]) 

National Longitudinal Study of the High School Class of 1972 (NLS:72/86) 

National Longitudinal Study of the High School Class of 1972, “First Follow-up” 
(NLS:72/73) 

National Longitudinal Study of the High School Class of 1972, “Fifth Follow-up” 
(NLS:72/86) 

(First, second, third, fourth, and fifth follow-up surveys were conducted in 1973, 1974, 
1976, 1979, and 1986, respectively.) 

NOTE: For longitudinal studies data, the follow-up reference is often left out because data are 
probably from multiple collections, so just the acronym for base year and last collection is used, 
e.g., 1990 Beginning Postsecondary Students Longitudinal Study (BPS: 1990/1994). 

16. EARLY CHILDHOOD LONGITUDINAL STUDY-BIRTH COHORT (ECLS-B) 

Early Childhood Longitudinal Study, Birth Cohort (ECLS-B), 

[of children born in calendar year 2001: collection at 9 mos., 24 mos., 30 mos., 48 mos., 
kindergarten entry, and first grade] 

Components: 

Children’s Birth Certificates 
Parent-Guardian Interviews 
Father Questionnaires 
Direct Child Assessments 
Early Care and Education Providers 
Teacher Questionnaires 
School Questionnaires 


17. EARLY CHILDHOOD LONGITUDINAL STUDY, KINDERGARTEN CLASS OF 
1998-99 

Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 (ECLS-K), [season of 
collection] [four-digit year]. 

Waves: 

Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 (ECLS-K), fall 1998 
Early Childhood Longitudinal Study, Kindergarten Class of 1998-99 (ECLS-K), fall 1998 
and spring 1999 

Components: 

Student Assessments 
Parent-Guardian Interviews 
Teacher Questionnaires 
Teacher Ratings 

Special Education Teacher Questionnaires 

School Questionnaires 

Salary and Benefits Questionnaire 

Student Records Abstract Form 

Verification of Head Start Program Participation 


18. POSTSECONDARY EDUCATION QUICK INFORMATION SYSTEM 

Postsecondary Education Quick Information System (PEQIS), “[survey title],” PEQIS [no.], 
[four-digit year]. 

Example: 

Postsecondary Education Quick Information System (PEQIS), “Survey on Students With 
Disabilities at Postsecondary Education Institutions,” 1998. 


19. NATIONAL ASSESSMENT OF EDUCATIONAL PROGRESS 

The National Assessment of Educational Progress (NAEP) [four-digit year(s)] [name of 
assessment] Assessment. 

Main National NAEP: 

Assessments: 

Mathematics 

Reading 

Science 

History 

Civics 

Arts 

Years — various years 1992 on 

Main State NAEP 

Assessments: 

Mathematics 

Reading 

Science 

History 

Years — various years 1992 on 

Trend NAEP 

National Assessment of Educational Progress (NAEP), [four-digit year] Trends in 
Academic Progress 

National Assessment of Educational Progress (NAEP), 1996 Trends in Academic Progress 
Assessments: 

Mathematics 

Science 

Reading 

Writing 

Years — various years 1969 on 

20. NAEP HIGH SCHOOL TRANSCRIPT STUDIES 

Years — 1990, 1994, 1998, and 2000 

The 1998 High School Transcript Study (HSTS) 
The 2000 High School Transcript Study (HSTS) 


21. NATIONAL ADULT LITERACY SURVEY (NALS)— NCES AND DEPT. OF ED. 

National Adult Literacy Survey (NALS) 

Years — 1992 


22. NATIONAL ASSESSMENT OF ADULT LITERACY (NAAL) 

National Assessment of Adult Literacy (NAAL) 

Years — 2003 


23. TRENDS IN INTERNATIONAL MATHEMATICS AND SCIENCE STUDY (TIMSS) 

Trends in International Mathematics and Science Study (TIMSS), 1995 

Trends in International Mathematics and Science Study (TIMSS), 1999 
Trends in International Mathematics and Science Study (TIMSS), 2003 
Years — 1995, 1999, and 2003 

NOTE: TIMSS was formerly known as the Third International Mathematics and Science Study. If 
relevant, also include a note stating that in earlier reports, TIMSS 1999 is also referred to as 
TIMSS-R (TIMSS-Repeat). 

24. 1999 CIVIC EDUCATION STUDY (CivEd) 

Components: 

Student Questionnaire 
School Questionnaire 

25. PROGRAM FOR INTERNATIONAL STUDENT ASSESSMENT (PISA) 

Components: 

Assessment Items 
Student Questionnaire 
School Questionnaire 

26. 2001 PROGRESS IN INTERNATIONAL READING LITERACY STUDY (PIRLS) 

Components: 

Reading Assessment 

Student, Teacher, and School Questionnaires 


27. ADULT LITERACY AND LIFESKILLS (ALL) 

28. PUBLIC LIBRARIES SURVEY (PLS) 

Public Libraries Survey (PLS), fiscal year [four-digit year] 

Years — 1989 through 2002, annual 


29. ACADEMIC LIBRARIES SURVEY (ALS) 

Academic Libraries Survey (ALS), [four-digit year] 

Years — 1966 through 1988, every three years; 1988 through 2002, every two years [up to 
1998, was part of IPEDS (IPEDS-L); 2000 and beyond, not a part of IPEDS] 


30. STATE LIBRARY AGENCIES (STLA) SURVEY 

State Library Agencies (StLA) Survey, fiscal year [four-digit year] 

Years — 1994 through 2002, annual 


31. RECENT COLLEGE GRADUATES STUDY (RCG) 

Recent College Graduates Study (RCG), [four-digit year] 

Years— 1976, 1978, 1981, 1985, 1987, and 1991 

32. SCHOOL SURVEY ON CRIME AND SAFETY (SSOCS) 

School Survey on Crime and Safety (SSOCS), [four-digit year] 

Years — 2000, 2004 (expected) 

Example: 

U.S. Department of Education, National Center for Education Statistics, School Survey on 
Crime and Safety (SSOCS), 2000. 

NCES Joint Surveys With Non-NCES Entities 

SCHOOL CRIME SUPPLEMENT (SCS) TO THE NATIONAL CRIME VICTIMIZATION 
SURVEY— NCES AND DEPT. OF JUSTICE 

U.S. Department of Justice, Bureau of Justice Statistics, School Crime Supplement (SCS) to the 
National Crime Victimization Survey, January-June 1989, 1995, and 1999. 

SURVEY OF EARNED DOCTORATES (SED) AWARDED IN THE UNITED STATES 

Survey of Earned Doctorates Awarded in the United States 


Old NCES Survey Names 

HIGHER EDUCATION GENERAL INFORMATION SURVEY (HEGIS) 

Precursor to IPEDS — collected data from 1965 to 1986 

Higher Education General Information Survey (HEGIS), “Fall Enrollment in Colleges and 
Universities” 


Non-NCES Surveys 


CENSUS SURVEYS 

U.S. Department of Commerce, Bureau of the Census, Current Population Survey (CPS) [“name 
of survey”], year [“unpublished tabulations” if it applies]. 

U.S. Department of Commerce, Bureau of the Census, Current Population Survey (CPS), 
October 1999. 

U.S. Bureau of the Census, Current Population Survey (CPS), October 1994. 

Bureau of the Census: “Annual Survey of Government Finances: School Systems,” 1997. 

Basic CPS 

U.S. Census Bureau, Current Population Survey (CPS), November 1979 and 1989, and 
1997. 

U.S. Bureau of the Census, Current Population Survey (CPS), March 1971-98. 

U.S. Department of Commerce, Bureau of the Census, Current Population Survey (CPS), 
October 1997 

U.S. Department of Commerce, Bureau of the Census, Current Population Reports, “Voting 
and Registration in the Election of November” (various years), series P-20, Nos. 143, 
440, and 504. 

October Supplements: 

Computer Use (1984, 1989, 1993, and 1997) 

Private School Tuition (1979, 1985, 1988, 1991, 1994, and 1997) 

Selected Education Characteristics (1992 and 1995) 

Summer Activities (1996) 


BUREAU OF JUSTICE STATISTICS (BJS) 

National Crime Victimization Survey (NCVS), 1992-98 (annual) 